Hi Andrzej,
I am running 0.8-dev revision 374745.
Searching works fine when the Tomcat search client's searcher.dir is
configured to point at the crawl directory, as follows.
*** $CATALINA_HOME/webapps/ROOT/WEB-INF/classes/nutch-site.xml contains:
<property>
<name>searcher.dir</name>
<value>/home/nutch/nutch-0.8-dev-test/crawlA/</value>
<description>
Path to root of index directories.
</description>
</property>
However, I get an error from the tomcat search client when I try to set up
distributed search using the following config:
<property>
<name>searcher.dir</name>
<value>/hosts</value>
<description>
Path to root of index directories.
</description>
</property>
*** /hosts/search-servers.txt contains:
nutch1.houxou.com 8081
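For anyone reproducing this, a quick sanity check is to confirm that every entry in search-servers.txt accepts TCP connections from the Tomcat host. The sketch below is not part of Nutch: parse_search_servers and is_reachable are hypothetical helper names, and it assumes the file holds one "host port" pair per line, as in the example above. A successful connect only shows that something is listening on the port, not that the Nutch RPC call itself will succeed.

```python
import socket
import sys


def parse_search_servers(text):
    """Parse 'host port' lines into (host, port) tuples, skipping blank lines."""
    servers = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if len(parts) != 2:
            raise ValueError(f"expected 'host port', got: {line!r}")
        servers.append((parts[0], int(parts[1])))
    return servers


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Default path matches the searcher.dir value used in this report.
    path = sys.argv[1] if len(sys.argv) > 1 else "/hosts/search-servers.txt"
    with open(path) as f:
        for host, port in parse_search_servers(f.read()):
            state = "open" if is_reachable(host, port) else "unreachable"
            print(f"{host}:{port} {state}")
```

Running it on the Tomcat machine, with the search server already started, would separate a plain connectivity problem from a failure inside the search client.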
*** crawl directory tree looks like this:
crawlA/
crawlA/linkdb
crawlA/linkdb/current
crawlA/linkdb/current/part-00000
crawlA/linkdb/current/part-00000/index
crawlA/linkdb/current/part-00000/data
crawlA/linkdb/current/part-00000/.data.crc
crawlA/linkdb/current/part-00000/.index.crc
crawlA/indexes
crawlA/indexes/part-00000
crawlA/indexes/part-00000/_2.f2
crawlA/indexes/part-00000/_2.tis
crawlA/indexes/part-00000/deletable
crawlA/indexes/part-00000/_2.f3
crawlA/indexes/part-00000/_2.frq
crawlA/indexes/part-00000/_2.f4
crawlA/indexes/part-00000/_2.tii
crawlA/indexes/part-00000/_2.fdt
crawlA/indexes/part-00000/index.done
crawlA/indexes/part-00000/_2.f1
crawlA/indexes/part-00000/_2.prx
crawlA/indexes/part-00000/_2.fnm
crawlA/indexes/part-00000/_2.f0
crawlA/indexes/part-00000/segments
crawlA/indexes/part-00000/_2.fdx
crawlA/crawldb
crawlA/crawldb/current
crawlA/crawldb/current/part-00000
crawlA/crawldb/current/part-00000/index
crawlA/crawldb/current/part-00000/data
crawlA/crawldb/current/part-00000/.data.crc
crawlA/crawldb/current/part-00000/.index.crc
crawlA/segments
crawlA/segments/20060316144827
crawlA/segments/20060316144827/crawl_generate
crawlA/segments/20060316144827/crawl_generate/part-00000
crawlA/segments/20060316144827/crawl_generate/.part-00000.crc
crawlA/segments/20060316144827/crawl_parse
crawlA/segments/20060316144827/crawl_parse/part-00000
crawlA/segments/20060316144827/crawl_parse/.part-00000.crc
crawlA/segments/20060316144827/parse_text
crawlA/segments/20060316144827/parse_text/part-00000
crawlA/segments/20060316144827/parse_text/part-00000/index
crawlA/segments/20060316144827/parse_text/part-00000/data
crawlA/segments/20060316144827/parse_text/part-00000/.data.crc
crawlA/segments/20060316144827/parse_text/part-00000/.index.crc
crawlA/segments/20060316144827/parse_data
crawlA/segments/20060316144827/parse_data/part-00000
crawlA/segments/20060316144827/parse_data/part-00000/index
crawlA/segments/20060316144827/parse_data/part-00000/data
crawlA/segments/20060316144827/parse_data/part-00000/.data.crc
crawlA/segments/20060316144827/parse_data/part-00000/.index.crc
crawlA/segments/20060316144827/content
crawlA/segments/20060316144827/content/part-00000
crawlA/segments/20060316144827/content/part-00000/index
crawlA/segments/20060316144827/content/part-00000/data
crawlA/segments/20060316144827/content/part-00000/.data.crc
crawlA/segments/20060316144827/content/part-00000/.index.crc
crawlA/segments/20060316144827/crawl_fetch
crawlA/segments/20060316144827/crawl_fetch/part-00000
crawlA/segments/20060316144827/crawl_fetch/part-00000/index
crawlA/segments/20060316144827/crawl_fetch/part-00000/data
crawlA/segments/20060316144827/crawl_fetch/part-00000/.data.crc
crawlA/segments/20060316144827/crawl_fetch/part-00000/.index.crc
*** Invoking the search server
I have tried starting the search server pointed at the crawl directory,
crawlA, and, for good measure, at the "indexes" directory within it.
# bin/nutch server 8081 crawlA/indexes
or
# bin/nutch server 8081 crawlA
*** The tomcat search client then produces the following output:
HTTP Status 500 -
type Exception report
message
description The server encountered an internal error () that prevented it from
fulfilling this request.
exception
org.apache.jasper.JasperException
org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:510)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:393)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
root cause
java.lang.NullPointerException
org.apache.nutch.ipc.RPC.call(RPC.java:162)
org.apache.nutch.searcher.DistributedSearch$Client.updateSegments(DistributedSearch.java:157)
org.apache.nutch.searcher.DistributedSearch$Client.<init>(DistributedSearch.java:118)
org.apache.nutch.searcher.DistributedSearch$Client.<init>(DistributedSearch.java:92)
org.apache.nutch.searcher.NutchBean.<init>(NutchBean.java:98)
org.apache.nutch.searcher.NutchBean.<init>(NutchBean.java:80)
org.apache.nutch.searcher.NutchBean.get(NutchBean.java:67)
org.apache.jsp.search_jsp._jspService(search_jsp.java:108)
org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:332)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
note The full stack trace of the root cause is available in the Apache
Tomcat/5.5.16 logs.
*** The tomcat logs show
# cat /usr/local/tomcat/logs/localhost.2006-03-16.log
16-Mar-2006 21:27:00 org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Servlet.service() for servlet jsp threw exception
java.lang.NullPointerException
at org.apache.nutch.ipc.RPC.call(RPC.java:162)
at org.apache.nutch.searcher.DistributedSearch$Client.updateSegments(DistributedSearch.java:157)
at org.apache.nutch.searcher.DistributedSearch$Client.<init>(DistributedSearch.java:118)
at org.apache.nutch.searcher.DistributedSearch$Client.<init>(DistributedSearch.java:92)
at org.apache.nutch.searcher.NutchBean.<init>(NutchBean.java:98)
at org.apache.nutch.searcher.NutchBean.<init>(NutchBean.java:80)
at org.apache.nutch.searcher.NutchBean.get(NutchBean.java:67)
at org.apache.jsp.search_jsp._jspService(search_jsp.java:108)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:332)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
at java.lang.Thread.run(Thread.java:595)
*** end
Is this a known bug for which there is a patch, or are the directories in the
wrong places?
Many thanks,
Monu Ogbe