Alright, I'll try that the next time I'm at work (that will be next Friday,
because I'm just a student worker).
Thanks for your great help ;)

Regards,
-- Erik H.
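
For the segment check suggested further down in this thread, here is a minimal
sketch, not a tested tool. The subfolder names are the five listed in the quoted
reply for Nutch 0.9; other Nutch versions may lay out segments differently. The
function name missing_subfolders and the path in the usage note are hypothetical.

```python
import os

# Subfolder names as listed in this thread for Nutch 0.9 (an assumption for
# any other version; adjust the tuple if your segments are laid out differently).
EXPECTED = ("content", "crawl_generate", "crawl_parse", "parse_data", "parse_text")


def missing_subfolders(segments_dir, expected=EXPECTED):
    """Return {segment name: [missing subfolders]} for incomplete segments."""
    report = {}
    for name in sorted(os.listdir(segments_dir)):
        segment = os.path.join(segments_dir, name)
        if not os.path.isdir(segment):
            continue  # skip stray files next to the segment directories
        missing = [
            sub for sub in expected
            if not os.path.isdir(os.path.join(segment, sub))
        ]
        if missing:
            report[name] = missing
    return report
```

Calling missing_subfolders("crawl.db/segments") (path assumed) returns an empty
dict when every segment has all the expected subfolders. Segments that show up
in the result are the candidates for parsing again.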


Gal Nitzan wrote:
> Erik,
>
> I'm not sure, because I worked with your version a long time ago (I now work
> with 0.9), so I may be wrong about the "crawl_generate" and "crawl_parse"
> folders in the segment structure.
>
> However, two days ago I had that same exception when one of my segments was
> missing the parse folder.
>
> So maybe you need to parse the segments again (bin/nutch parse
> segments/segmentname).
>
> HTH,
>
> Gal.
>
>
>
> -----Original Message-----
> From: Erik Höschler [mailto:[EMAIL PROTECTED] 
> Sent: Friday, January 26, 2007 6:21 PM
> To: [email protected]
> Subject: Re: Problems Searching an Index with Nutch
>
> Ok,
>
> I could not find any crawl_generate or crawl_parse folder. I also couldn't
> find Catalina.out anywhere on my system.
>
> One thing I don't understand: Nutch is supposed to create my folder
> structure itself. If there is a fault in it, such as the missing folders
> or the 'db' folder that should normally be 'linkdb', how can I fix it?
> I didn't change anything in the structure myself, so it must have been
> created by Nutch directly. Any idea how this could happen?
>
> Thanks for your time ;)
>
> --Erik
>
> Gal Nitzan wrote:
>
>> Well, I guess that db is the linkdb in version 0.7.
>>
>> Anyway, there is not much information here; maybe you can find more in
>> Catalina.out.
>>
>> One more thing to look for, which just might be the reason (a long shot):
>> check each of your segment folders and verify that it contains all 5 folders,
>> i.e. content, crawl_generate, crawl_parse, parse_data, parse_text.
>>
>> HTH
>>
>> Gal.
>>
>> -----Original Message-----
>> From: Erik Höschler [mailto:[EMAIL PROTECTED] 
>> Sent: Friday, January 26, 2007 5:58 PM
>> To: [email protected]
>> Subject: Re: Problems Searching an Index with Nutch
>>
>> Hi,
>>
>> I checked my folder structure and everything seems to be correct...
>>
>> :/opt/nutch/crawl.db# l
>> insgesamt 8
>> drwxr-xr-x   3 root root   53 2007-01-19 14:11 db
>> drwxr-xr-x   2 root root 4096 2007-01-19 14:18 index
>> drwxr-xr-x  12 root root 4096 2007-01-26 15:06 segments
>>
>> I'm not sure I've ever had a linkdb folder. Or did you mean the db
>> folder listed above?
>>
>> Greetings,
>> Erik
>>
>> Gal Nitzan wrote:
>>
>>> Hi,
>>>
>>> I'm not sure, but it seems to me you are missing the linkdb and segments
>>> folders. They should be located on the same level as the index folder.
>>>
>>> HTH,
>>>
>>> Gal
>>>
>>> -----Original Message-----
>>> From: Erik Höschler [mailto:[EMAIL PROTECTED] 
>>> Sent: Friday, January 26, 2007 5:04 PM
>>> To: [email protected]
>>> Cc: Erik
>>> Subject: Problems Searching an Index with Nutch
>>>
>>> Hi,
>>>
>>> I'm running Nutch-0.7.2. I created an index for my local LAN, which
>>> consists of 45,000 pages.
>>> I can inspect this index with Luke and everything looks fine. When I try
>>> to run a search query with Nutch, I see the following exception in my
>>> JBoss log file (at the end of the log).
>>>
>>>
>>> // Here I'm redeploying the Nutch.war archive...
>>> 2007-01-26 15:55:06,611 INFO  [org.jboss.web.tomcat.tc5.TomcatDeployer] deploy, ctxPath=/nutch, warUrl=file:/srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/
>>> 2007-01-26 15:55:06,831 DEBUG [tomcat.localhost./nutch.Context] Starting tomcat.localhost./nutch.Context
>>> 2007-01-26 15:55:06,832 DEBUG [tomcat.localhost./nutch.Context] Configuring default Resources
>>> 2007-01-26 15:55:06,836 DEBUG [tomcat.localhost./nutch.Context] Processing standard container startup
>>> 2007-01-26 15:55:06,844 DEBUG [tomcat.localhost./nutch.Context] Setting deployment descriptor public ID to '-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN'
>>> 2007-01-26 15:55:06,862 DEBUG [tomcat.localhost./nutch.Context] Setting deployment descriptor public ID to '-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN'
>>> 2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context] Posting standard context attributes
>>> 2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context] Configuring application event listeners
>>> 2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context] Sending application start events
>>> 2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context] Starting filters
>>> 2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context]  Starting filter 'CommonHeadersFilter'
>>> 2007-01-26 15:55:06,867 DEBUG [tomcat.localhost./nutch.Context] Starting completed   // Archive successfully loaded...?!?!
>>> 2007-01-26 15:55:06,867 DEBUG [tomcat.localhost./nutch.Context] Checking for jboss.web:j2eeType=WebModule,name=//localhost/nutch,J2EEApplication=none,J2EEServer=none
>>>
>>>
>>> // Here I started a query in my web browser...
>>> 2007-01-26 15:55:53,585 INFO  [STDOUT] 070126 155553 parsing file:/srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/nutch-default.xml
>>> 2007-01-26 15:55:53,591 INFO  [STDOUT] 070126 155553 parsing file:/srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/nutch-site.xml
>>> 2007-01-26 15:55:53,599 INFO  [STDOUT] 070126 155553 Plugins: looking in: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins
>>> 2007-01-26 15:55:53,600 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/clustering-carrot2
>>> 2007-01-26 15:55:53,600 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/creativecommons
>>> 2007-01-26 15:55:53,600 INFO  [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/index-basic/plugin.xml
>>> 2007-01-26 15:55:53,607 INFO  [STDOUT] 070126 155553 impl: point=org.apache.nutch.indexer.IndexingFilter class=org.apache.nutch.indexer.basic.BasicIndexingFilter
>>> 2007-01-26 15:55:53,609 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/index-more
>>> 2007-01-26 15:55:53,609 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/language-identifier
>>> 2007-01-26 15:55:53,609 INFO  [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/nutch-extensionpoints/plugin.xml
>>> 2007-01-26 15:55:53,612 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/ontology
>>> 2007-01-26 15:55:53,612 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-ext
>>> 2007-01-26 15:55:53,613 INFO  [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-html/plugin.xml
>>> 2007-01-26 15:55:53,614 INFO  [STDOUT] 070126 155553 impl: point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.html.HtmlParser
>>> 2007-01-26 15:55:53,615 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-js
>>> 2007-01-26 15:55:53,615 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-msword
>>> 2007-01-26 15:55:53,615 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-pdf
>>> 2007-01-26 15:55:53,615 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-rss
>>> 2007-01-26 15:55:53,615 INFO  [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-text/plugin.xml
>>> 2007-01-26 15:55:53,617 INFO  [STDOUT] 070126 155553 impl: point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.text.TextParser
>>> 2007-01-26 15:55:53,617 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/protocol-file
>>> 2007-01-26 15:55:53,618 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/protocol-ftp
>>> 2007-01-26 15:55:53,618 INFO  [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/protocol-http/plugin.xml
>>> 2007-01-26 15:55:53,619 INFO  [STDOUT] 070126 155553 impl: point=org.apache.nutch.protocol.Protocol class=org.apache.nutch.protocol.http.Http
>>> 2007-01-26 15:55:53,620 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/protocol-httpclient
>>> 2007-01-26 15:55:53,620 INFO  [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/query-basic/plugin.xml
>>> 2007-01-26 15:55:53,622 INFO  [STDOUT] 070126 155553 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.basic.BasicQueryFilter
>>> 2007-01-26 15:55:53,622 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/query-more
>>> 2007-01-26 15:55:53,622 INFO  [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/query-site/plugin.xml
>>> 2007-01-26 15:55:53,624 INFO  [STDOUT] 070126 155553 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.site.SiteQueryFilter
>>> 2007-01-26 15:55:53,624 INFO  [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/query-url/plugin.xml
>>> 2007-01-26 15:55:53,626 INFO  [STDOUT] 070126 155553 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.url.URLQueryFilter
>>> 2007-01-26 15:55:53,626 INFO  [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/urlfilter-prefix
>>> 2007-01-26 15:55:53,626 INFO  [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/urlfilter-regex/plugin.xml
>>> 2007-01-26 15:55:53,628 INFO  [STDOUT] 070126 155553 impl: point=org.apache.nutch.net.URLFilter class=org.apache.nutch.net.RegexURLFilter
>>> 2007-01-26 15:55:53,639 INFO  [STDOUT] 070126 155553 10 creating new bean
>>> 2007-01-26 15:55:53,640 INFO  [STDOUT] 070126 155553 10 opening segment indexes in /srv/opt/nutch-0.7.2/crawl.db/segments
>>> 2007-01-26 15:55:53,652 ERROR [org.jboss.web.localhost.Engine] StandardWrapperValve[jsp]: Servlet.service() for servlet jsp threw exception
>>> java.lang.ArrayIndexOutOfBoundsException
>>>
>>>
>>>
>>> In my browser I got the following error:
>>>
>>>
>>>   HTTP Status 500 -
>>>
>>> ------------------------------------------------------------------------
>>>
>>> *type* Exception report
>>>
>>> *message*
>>>
>>> *description* _The server encountered an internal error () that 
>>> prevented it from fulfilling this request._
>>>
>>> *exception*
>>>
>>> org.apache.jasper.JasperException
>>>     org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:372)
>>>     org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:292)
>>>     org.apache.jasper.servlet.JspServlet.service(JspServlet.java:236)
>>>     javax.servlet.http.HttpServlet.service(HttpServlet.java:810)
>>>     org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:75)
>>>
>>> *root cause*
>>>
>>> java.lang.ArrayIndexOutOfBoundsException
>>>
>>> *note* _The full stack trace of the root cause is available in the 
>>> Apache Tomcat/5.0.28 logs._
>>>
>>> ------------------------------------------------------------------------
>>>
>>>
>>>       Apache Tomcat/5.0.28
>>>
>>>
>>>
>>> I also tested the search on a newly created (small) index but got the
>>> same error. I also tried running Nutch-0.8.1, with the same result.
>>> I couldn't find any information about this error, and now I don't
>>> know what to do. Maybe you have an idea...
>>>
>>> Thanks in advance...
>>>
>>> Yours sincerely,
>>> Erik H.

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
