Well, I guess that db is what serves as the linkdb in version 0.7.
Anyway, there is not much information to go on. Maybe you can find more in
Catalina.out.
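If the log is large, a small script can pull out just the error entries and the lines that follow them. This is only a rough sketch: the function name, the context size, and the matching on " ERROR " and "Exception" are my own assumptions, not anything Nutch or Tomcat provides.

```python
import sys
from pathlib import Path


def find_error_blocks(lines, context=10):
    """Return (line_number, block) pairs for each error found in the log.

    A block starts at a line containing " ERROR " or "Exception" and
    includes up to `context` following lines (where a stack trace would be).
    Lines already included in a block are not reported again.
    """
    blocks = []
    i = 0
    while i < len(lines):
        if " ERROR " in lines[i] or "Exception" in lines[i]:
            end = min(len(lines), i + 1 + context)
            blocks.append((i + 1, lines[i:end]))
            i = end
        else:
            i += 1
    return blocks


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the log file path on the command line; which file to pass
    # (server log, Catalina.out, ...) depends on your installation.
    log_lines = Path(sys.argv[1]).read_text(errors="replace").splitlines()
    for number, block in find_error_blocks(log_lines):
        print(f"--- line {number} ---")
        print("\n".join(block))
```

Run it with the log file path as the only argument and look at what follows the ArrayIndexOutOfBoundsException line.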
One more thing to check, though it is a long shot: look in each of your
segment folders and verify that it contains all 5 subfolders,
i.e. content, crawl_generate, crawl_parse, parse_data, parse_text
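With many segments, checking by hand gets tedious, so here is a minimal sketch that does it for you. The subfolder names are simply the five listed above (the layout may differ between Nutch versions), and the function and constant names are made up for this example.

```python
import sys
from pathlib import Path

# The five subfolders named in this message; adjust for your Nutch version.
EXPECTED_SUBFOLDERS = (
    "content",
    "crawl_generate",
    "crawl_parse",
    "parse_data",
    "parse_text",
)


def missing_subfolders(segments_dir, expected=EXPECTED_SUBFOLDERS):
    """Map each incomplete segment name to the list of subfolders it lacks."""
    report = {}
    for segment in sorted(Path(segments_dir).iterdir()):
        if not segment.is_dir():
            continue  # ignore stray files next to the segment folders
        missing = [name for name in expected if not (segment / name).is_dir()]
        if missing:
            report[segment.name] = missing
    return report


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass your own segments directory as the only argument.
    for name, missing in missing_subfolders(sys.argv[1]).items():
        print(f"{name}: missing {', '.join(missing)}")
```

An empty result means every segment has all the listed subfolders, in which case this is probably not the cause.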
HTH
Gal.
-----Original Message-----
From: Erik Höschler [mailto:[EMAIL PROTECTED]]
Sent: Friday, January 26, 2007 5:58 PM
To: [email protected]
Subject: Re: Problems Searching an Index with Nutch
Hi,
I checked my folder structure and everything seems to be correct:
:/opt/nutch/crawl.db# l
total 8
drwxr-xr-x 3 root root 53 2007-01-19 14:11 db
drwxr-xr-x 2 root root 4096 2007-01-19 14:18 index
drwxr-xr-x 12 root root 4096 2007-01-26 15:06 segments
I'm not sure I have ever had a linkdb folder. Or did you mean the db
folder listed above?
Greetings,
Erik
Gal Nitzan wrote:
Hi,
I'm not sure, but it seems to me you are missing the linkdb and segments
folders. They should be located at the same level as the index folder.
HTH
Gal
-----Original Message-----
From: Erik Höschler [mailto:[EMAIL PROTECTED]]
Sent: Friday, January 26, 2007 5:04 PM
To: [email protected]
Cc: Erik
Subject: Problems Searching an Index with Nutch
Hi,
I'm running Nutch-0.7.2. I created an index for my local LAN, which
consists of 45,000 pages.
I can inspect this index with Luke and everything looks fine. When I try
to run a search query with Nutch,
I see the following exception in my JBoss log file (at the end of the
log).
// Here I'm redeploying the Nutch.war archive...
2007-01-26 15:55:06,611 INFO [org.jboss.web.tomcat.tc5.TomcatDeployer] deploy, ctxPath=/nutch, warUrl=file:/srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/
2007-01-26 15:55:06,831 DEBUG [tomcat.localhost./nutch.Context] Starting tomcat.localhost./nutch.Context
2007-01-26 15:55:06,832 DEBUG [tomcat.localhost./nutch.Context] Configuring default Resources
2007-01-26 15:55:06,836 DEBUG [tomcat.localhost./nutch.Context] Processing standard container startup
2007-01-26 15:55:06,844 DEBUG [tomcat.localhost./nutch.Context] Setting deployment descriptor public ID to '-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN'
2007-01-26 15:55:06,862 DEBUG [tomcat.localhost./nutch.Context] Setting deployment descriptor public ID to '-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN'
2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context] Posting standard context attributes
2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context] Configuring application event listeners
2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context] Sending application start events
2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context] Starting filters
2007-01-26 15:55:06,866 DEBUG [tomcat.localhost./nutch.Context] Starting filter 'CommonHeadersFilter'
2007-01-26 15:55:06,867 DEBUG [tomcat.localhost./nutch.Context] Starting completed //Archive successfully loaded...?!?!
2007-01-26 15:55:06,867 DEBUG [tomcat.localhost./nutch.Context] Checking for jboss.web:j2eeType=WebModule,name=//localhost/nutch,J2EEApplication=none,J2EEServer=none
// Here I started a query in my web browser...
2007-01-26 15:55:53,585 INFO [STDOUT] 070126 155553 parsing file:/srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/nutch-default.xml
2007-01-26 15:55:53,591 INFO [STDOUT] 070126 155553 parsing file:/srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/nutch-site.xml
2007-01-26 15:55:53,599 INFO [STDOUT] 070126 155553 Plugins: looking in: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins
2007-01-26 15:55:53,600 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/clustering-carrot2
2007-01-26 15:55:53,600 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/creativecommons
2007-01-26 15:55:53,600 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/index-basic/plugin.xml
2007-01-26 15:55:53,607 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.indexer.IndexingFilter class=org.apache.nutch.indexer.basic.BasicIndexingFilter
2007-01-26 15:55:53,609 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/index-more
2007-01-26 15:55:53,609 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/language-identifier
2007-01-26 15:55:53,609 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/nutch-extensionpoints/plugin.xml
2007-01-26 15:55:53,612 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/ontology
2007-01-26 15:55:53,612 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-ext
2007-01-26 15:55:53,613 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-html/plugin.xml
2007-01-26 15:55:53,614 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.html.HtmlParser
2007-01-26 15:55:53,615 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-js
2007-01-26 15:55:53,615 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-msword
2007-01-26 15:55:53,615 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-pdf
2007-01-26 15:55:53,615 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-rss
2007-01-26 15:55:53,615 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/parse-text/plugin.xml
2007-01-26 15:55:53,617 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.text.TextParser
2007-01-26 15:55:53,617 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/protocol-file
2007-01-26 15:55:53,618 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/protocol-ftp
2007-01-26 15:55:53,618 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/protocol-http/plugin.xml
2007-01-26 15:55:53,619 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.protocol.Protocol class=org.apache.nutch.protocol.http.Http
2007-01-26 15:55:53,620 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/protocol-httpclient
2007-01-26 15:55:53,620 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/query-basic/plugin.xml
2007-01-26 15:55:53,622 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.basic.BasicQueryFilter
2007-01-26 15:55:53,622 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/query-more
2007-01-26 15:55:53,622 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/query-site/plugin.xml
2007-01-26 15:55:53,624 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.site.SiteQueryFilter
2007-01-26 15:55:53,624 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/query-url/plugin.xml
2007-01-26 15:55:53,626 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.url.URLQueryFilter
2007-01-26 15:55:53,626 INFO [STDOUT] 070126 155553 not including: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/urlfilter-prefix
2007-01-26 15:55:53,626 INFO [STDOUT] 070126 155553 parsing: /srv/opt/jboss-3.2.6/server/ecs_cs/tmp/deploy/tmp31541nutch.war/WEB-INF/classes/plugins/urlfilter-regex/plugin.xml
2007-01-26 15:55:53,628 INFO [STDOUT] 070126 155553 impl: point=org.apache.nutch.net.URLFilter class=org.apache.nutch.net.RegexURLFilter
2007-01-26 15:55:53,639 INFO [STDOUT] 070126 155553 10 creating new bean
2007-01-26 15:55:53,640 INFO [STDOUT] 070126 155553 10 opening segment indexes in /srv/opt/nutch-0.7.2/crawl.db/segments
2007-01-26 15:55:53,652 ERROR [org.jboss.web.localhost.Engine] StandardWrapperValve[jsp]: Servlet.service() for servlet jsp threw exception
java.lang.ArrayIndexOutOfBoundsException
In my browser I got the following error:
HTTP Status 500 -
------------------------------------------------------------------------
*type* Exception report
*message*
*description* _The server encountered an internal error () that
prevented it from fulfilling this request._
*exception*
org.apache.jasper.JasperException
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:372)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:292)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:236)
javax.servlet.http.HttpServlet.service(HttpServlet.java:810)
org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:75)
*root cause*
java.lang.ArrayIndexOutOfBoundsException
*note* _The full stack trace of the root cause is available in the
Apache Tomcat/5.0.28 logs._
------------------------------------------------------------------------
Apache Tomcat/5.0.28
I also tested this search on a newly created index (a small one) but
got the same error. I also tried running Nutch-0.8.1, with the same result.
I couldn't find any information about this error and now I don't
know what to do. Maybe you have an idea...
Thanks in advance...
Yours sincerely,
Erik H.