Hi,

I checked my folder structure and everything seems to be correct:

:/opt/nutch/crawl.db# l
total 8
drwxr-xr-x   3 root root   53 2007-01-19 14:11 db
drwxr-xr-x   2 root root 4096 2007-01-19 14:18 index
drwxr-xr-x  12 root root 4096 2007-01-26 15:06 segments

I'm not sure I've ever had a linkdb folder. Or did you mean the db folder listed above?

Greetings,
Erik

Gal Nitzan wrote:
Hi,

I'm not sure, but it seems to me you are missing the linkdb and segments
folders. They should be located on the same level as the index folder.
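
A quick way to check this is sketched below in Python. It is only a sketch: the helper name missing_subdirs is hypothetical, the sub-directory names are simply the ones mentioned in this thread (whether a given Nutch version needs all of them is an assumption), and the path is a placeholder for your own crawl directory.

```python
import os

# Sub-directory names taken from this thread; whether a given Nutch
# version really needs all of them is an assumption.
EXPECTED_SUBDIRS = ("index", "linkdb", "segments")


def missing_subdirs(crawl_dir, expected=EXPECTED_SUBDIRS):
    """Return the expected names that are not directories under crawl_dir."""
    return [name for name in expected
            if not os.path.isdir(os.path.join(crawl_dir, name))]


if __name__ == "__main__":
    # Placeholder path; replace it with your own crawl directory.
    crawl_dir = "/opt/nutch/crawl.db"
    print(missing_subdirs(crawl_dir))
```

An empty list means every expected folder sits next to the others; any name that is printed is missing at that level.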

HTH,

Gal

-----Original Message-----
From: Erik Höschler [mailto:[EMAIL PROTECTED]
Sent: Friday, January 26, 2007 5:04 PM
To: [email protected]
Cc: Erik
Subject: Problems Searching an Index with Nutch

Hi,

I'm running Nutch-0.7.2. I created an index for my local LAN, which consists of 45,000 pages. I can inspect this index with Luke and everything looks fine. When I try to run a search query with Nutch, I see the following exception in my JBoss log file (at the end of the log).


//Here I'm redeploying the Nutch.war archive....
2007-01-26 15:55:06,611 INFO [org.jboss.web.tomcat.tc5.TomcatDeployer] deploy, ctxPath=/nutch, warUrl=file:/srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/
2007-01-26 15:55:06,831 DEBUG [tomcat.localhost./nutch.Context] Starting tomcat.localhost./nutch.Context
2007-01-26 15:55:06,832 DEBUG [tomcat.localhost./nutch.Context] Configuring default Resources
2007-01-26 15:55:06,836 DEBUG [tomcat.localhost./nutch.Context] Processing standard container startup
2007-01-26 15:55:06,844 DEBUG [tomcat.localhost./nutch.Context] Setting deployment descriptor public ID to '-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN'
2007-01-26 15:55:06,862 DEBUG [tomcat.localhost./nutch.Context] Setting deployment descriptor public ID to '-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN'
2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context] Posting standard context attributes
2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context] Configuring application event listeners
2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context] Sending application start events
2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context] Starting filters
2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context] Starting filter 'CommonHeadersFilter'
2007-01-26 15:55:06,867 DEBUG [tomcat.localhost./nutch.Context] Starting completed
//Archive successfully loaded...?!?!
2007-01-26 15:55:06,867 DEBUG [tomcat.localhost./nutch.Context] Checking for jboss.web:j2eeType=WebModule,name=//localhost/nutch,J2EEApplication=none,J2EEServer=none


//Here I started a query in my web browser...
2007-01-26 15:55:53,585 INFO [STDOUT] 070126 155553 parsing file:/srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/nutch-default.xml
2007-01-26 15:55:53,591 INFO [STDOUT] 070126 155553 parsing file:/srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/nutch-site.xml
2007-01-26 15:55:53,599 INFO [STDOUT] 070126 155553 Plugins: looking in: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins
2007-01-26 15:55:53,600 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/clustering-carrot2
2007-01-26 15:55:53,600 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/creativecommons
2007-01-26 15:55:53,600 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/index-basic/plugin.xml
2007-01-26 15:55:53,607 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.indexer.IndexingFilter class=org.apache.nutch.indexer.basic.BasicIndexingFilter
2007-01-26 15:55:53,609 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/index-more
2007-01-26 15:55:53,609 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/language-identifier
2007-01-26 15:55:53,609 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/nutch-extensionpoints/plugin.xml
2007-01-26 15:55:53,612 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/ontology
2007-01-26 15:55:53,612 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-ext
2007-01-26 15:55:53,613 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-html/plugin.xml
2007-01-26 15:55:53,614 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.html.HtmlParser
2007-01-26 15:55:53,615 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-js
2007-01-26 15:55:53,615 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-msword
2007-01-26 15:55:53,615 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-pdf
2007-01-26 15:55:53,615 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-rss
2007-01-26 15:55:53,615 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-text/plugin.xml
2007-01-26 15:55:53,617 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.text.TextParser
2007-01-26 15:55:53,617 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/protocol-file
2007-01-26 15:55:53,618 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/protocol-ftp
2007-01-26 15:55:53,618 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/protocol-http/plugin.xml
2007-01-26 15:55:53,619 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.protocol.Protocol class=org.apache.nutch.protocol.http.Http
2007-01-26 15:55:53,620 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/protocol-httpclient
2007-01-26 15:55:53,620 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/query-basic/plugin.xml
2007-01-26 15:55:53,622 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.basic.BasicQueryFilter
2007-01-26 15:55:53,622 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/query-more
2007-01-26 15:55:53,622 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/query-site/plugin.xml
2007-01-26 15:55:53,624 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.site.SiteQueryFilter
2007-01-26 15:55:53,624 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/query-url/plugin.xml
2007-01-26 15:55:53,626 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.url.URLQueryFilter
2007-01-26 15:55:53,626 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/urlfilter-prefix
2007-01-26 15:55:53,626 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/urlfilter-regex/plugin.xml
2007-01-26 15:55:53,628 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.net.URLFilter class=org.apache.nutch.net.RegexURLFilter
2007-01-26 15:55:53,639 INFO [STDOUT] 070126 155553 10 creating new bean
2007-01-26 15:55:53,640 INFO [STDOUT] 070126 155553 10 opening segment indexes in /srv/opt/nutch-0.7.2/crawl.db/segments
2007-01-26 15:55:53,652 ERROR [org.jboss.web.localhost.Engine] StandardWrapperValve[jsp]: Servlet.service() for servlet jsp threw exception
java.lang.ArrayIndexOutOfBoundsException



In my browser I got the following error:


  HTTP Status 500 -

------------------------------------------------------------------------

*type* Exception report

*message*

*description* _The server encountered an internal error () that prevented it from fulfilling this request._

*exception*

org.apache.jasper.JasperException
        org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:372)
        org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:292)
        org.apache.jasper.servlet.JspServlet.service(JspServlet.java:236)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:810)
        org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:75)

*root cause*

java.lang.ArrayIndexOutOfBoundsException

*note* _The full stack trace of the root cause is available in the Apache Tomcat/5.0.28 logs._

------------------------------------------------------------------------


      Apache Tomcat/5.0.28



I also tested this search on a newly created index (a small one) but got the same error. I also tried running Nutch-0.8.1, with the same result. I couldn't find any information about this error, and now I don't know what to do. Maybe you have an idea...

Thanks in advance...

Yours sincerely,
Erik H.


