Hello Team,

Partial false alarm.

I have worked out that I get exactly the same error if the nutch server is NOT
running!  So perhaps my tomcat search client:

-  is not finding the /hosts/search-servers.txt file; or
-  is not interpreting the "address port" line in it
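
Both possibilities can be checked outside Tomcat with a small script. The following is a minimal sketch, not Nutch's own parser: the function names are hypothetical, and the assumption that each non-blank line must be exactly "host port" separated by whitespace is taken from the file format described in this thread, not from the Nutch source. Running it as the same user that runs Tomcat would also expose permission problems on the directory.

```python
from pathlib import Path


def parse_search_servers(text):
    """Split 'host port' lines into (host, port) pairs.

    Returns (servers, problems), where problems lists the
    (line_number, raw_line) of every line that does not look
    like 'host port'. Blank lines are skipped in this sketch;
    whether Nutch itself tolerates them is not verified here.
    """
    servers, problems = [], []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            problems.append((lineno, raw))
            continue
        servers.append((parts[0], int(parts[1])))
    return servers, problems


def check_search_servers(searcher_dir):
    """Return None if search-servers.txt is missing, else the parse result."""
    path = Path(searcher_dir) / "search-servers.txt"
    if not path.is_file():
        return None
    return parse_search_servers(path.read_text())
```

For the setup described here, check_search_servers("/hosts") returning None would mean the file is not visible to the calling user, and a non-empty problems list would point at a malformed line.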

I find that I CAN telnet from the command line to port 8081:

# telnet 193.203.240.118 8081
Trying 193.203.240.118...
Connected to nutch1.houxou.com (193.203.240.118).
Escape character is '^]'.

In this case, I get the following diagnostic output from the "nutch server"
console:

060317 112919 22 Server connection on port 8081 from 193.203.240.118: starting

However, when the tomcat search client tries to search, there is NO output on
the "nutch server" console.
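
The telnet test above used the IP address, while search-servers.txt names the host. A scripted version of the same connectivity check makes it easy to repeat the test with the exact host string from the file. This is only a sketch of a plain TCP connect using Python's standard library; it does not speak the Nutch IPC protocol, and the function name is hypothetical.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Name resolution failures, refused connections and timeouts
    all count as 'cannot connect'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling can_connect("nutch1.houxou.com", 8081) from the Tomcat machine, as the Tomcat user, should produce the same "Server connection on port 8081" line on the server console if the name resolves and the port is reachable.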

Sounds like I'm getting closer to the problem, but help still gratefully
awaited! :)

Many thanks,

Monu Ogbe



Quoting [EMAIL PROTECTED]:

Hi Andrzej,

I am running 0.8-dev revision 374745.

Searching works fine when the tomcat search client's searcher.dir is configured
to point at the crawl directory as follows.

*** $CATALINA_HOME/webapps/ROOT/WEB-INF/classes/nutch-site.xml contains:

        <property>
          <name>searcher.dir</name>
          <value>/home/nutch/nutch-0.8-dev-test/crawlA/</value>
          <description>
          Path to root of index directories.
          </description>
        </property>

However, I get an error from the tomcat search client when I try to set up
distributed search using the following config:

        <property>
          <name>searcher.dir</name>
          <value>/hosts</value>
          <description>
          Path to root of index directories.
          </description>
        </property>

*** /hosts/search-servers.txt contains:


nutch1.houxou.com 8081
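
To confirm which searcher.dir value the webapp's nutch-site.xml actually carries, the file can be read directly. The sketch below is an illustration only: the function name is hypothetical, it uses a generic XML parser and not Nutch's configuration loader, and the `<configuration>` root element in the usage example is an assumption, since only the `<property>` fragments are shown in this message. It returns every value found, so a leftover duplicate entry would show up as a list with two items.

```python
import xml.etree.ElementTree as ET


def read_property_values(xml_text, name):
    """Return all <value> texts of <property> elements whose <name> matches."""
    root = ET.fromstring(xml_text)
    values = []
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            values.append(value.strip() if value is not None else None)
    return values


# Usage example with an assumed <configuration> root element.
sample = """<configuration>
  <property>
    <name>searcher.dir</name>
    <value>/hosts</value>
  </property>
</configuration>"""
print(read_property_values(sample, "searcher.dir"))
```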


*** crawl directory tree looks like this:

crawlA/
crawlA/linkdb
crawlA/linkdb/current
crawlA/linkdb/current/part-00000
crawlA/linkdb/current/part-00000/index
crawlA/linkdb/current/part-00000/data
crawlA/linkdb/current/part-00000/.data.crc
crawlA/linkdb/current/part-00000/.index.crc
crawlA/indexes
crawlA/indexes/part-00000
crawlA/indexes/part-00000/_2.f2
crawlA/indexes/part-00000/_2.tis
crawlA/indexes/part-00000/deletable
crawlA/indexes/part-00000/_2.f3
crawlA/indexes/part-00000/_2.frq
crawlA/indexes/part-00000/_2.f4
crawlA/indexes/part-00000/_2.tii
crawlA/indexes/part-00000/_2.fdt
crawlA/indexes/part-00000/index.done
crawlA/indexes/part-00000/_2.f1
crawlA/indexes/part-00000/_2.prx
crawlA/indexes/part-00000/_2.fnm
crawlA/indexes/part-00000/_2.f0
crawlA/indexes/part-00000/segments
crawlA/indexes/part-00000/_2.fdx
crawlA/crawldb
crawlA/crawldb/current
crawlA/crawldb/current/part-00000
crawlA/crawldb/current/part-00000/index
crawlA/crawldb/current/part-00000/data
crawlA/crawldb/current/part-00000/.data.crc
crawlA/crawldb/current/part-00000/.index.crc
crawlA/segments
crawlA/segments/20060316144827
crawlA/segments/20060316144827/crawl_generate
crawlA/segments/20060316144827/crawl_generate/part-00000
crawlA/segments/20060316144827/crawl_generate/.part-00000.crc
crawlA/segments/20060316144827/crawl_parse
crawlA/segments/20060316144827/crawl_parse/part-00000
crawlA/segments/20060316144827/crawl_parse/.part-00000.crc
crawlA/segments/20060316144827/parse_text
crawlA/segments/20060316144827/parse_text/part-00000
crawlA/segments/20060316144827/parse_text/part-00000/index
crawlA/segments/20060316144827/parse_text/part-00000/data
crawlA/segments/20060316144827/parse_text/part-00000/.data.crc
crawlA/segments/20060316144827/parse_text/part-00000/.index.crc
crawlA/segments/20060316144827/parse_data
crawlA/segments/20060316144827/parse_data/part-00000
crawlA/segments/20060316144827/parse_data/part-00000/index
crawlA/segments/20060316144827/parse_data/part-00000/data
crawlA/segments/20060316144827/parse_data/part-00000/.data.crc
crawlA/segments/20060316144827/parse_data/part-00000/.index.crc
crawlA/segments/20060316144827/content
crawlA/segments/20060316144827/content/part-00000
crawlA/segments/20060316144827/content/part-00000/index
crawlA/segments/20060316144827/content/part-00000/data
crawlA/segments/20060316144827/content/part-00000/.data.crc
crawlA/segments/20060316144827/content/part-00000/.index.crc
crawlA/segments/20060316144827/crawl_fetch
crawlA/segments/20060316144827/crawl_fetch/part-00000
crawlA/segments/20060316144827/crawl_fetch/part-00000/index
crawlA/segments/20060316144827/crawl_fetch/part-00000/data
crawlA/segments/20060316144827/crawl_fetch/part-00000/.data.crc
crawlA/segments/20060316144827/crawl_fetch/part-00000/.index.crc


*** Invoking the search server

I have tried invoking the search server pointing at the crawl directory,
crawlA, and just for good measure I have also tried pointing at the "indexes"
directory within it.

        # bin/nutch server 8081 crawlA/indexes
or
        # bin/nutch server 8081 crawlA


*** The tomcat search client then produces the following output:

HTTP Status 500 -

type Exception report

message

description The server encountered an internal error () that prevented it from
fulfilling this request.

exception

org.apache.jasper.JasperException
        org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:510)
        org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:393)
        org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
        org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

root cause

java.lang.NullPointerException
        org.apache.nutch.ipc.RPC.call(RPC.java:162)
        org.apache.nutch.searcher.DistributedSearch$Client.updateSegments(DistributedSearch.java:157)
        org.apache.nutch.searcher.DistributedSearch$Client.<init>(DistributedSearch.java:118)
        org.apache.nutch.searcher.DistributedSearch$Client.<init>(DistributedSearch.java:92)
        org.apache.nutch.searcher.NutchBean.<init>(NutchBean.java:98)
        org.apache.nutch.searcher.NutchBean.<init>(NutchBean.java:80)
        org.apache.nutch.searcher.NutchBean.get(NutchBean.java:67)
        org.apache.jsp.search_jsp._jspService(search_jsp.java:108)
        org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
        org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:332)
        org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
        org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

note The full stack trace of the root cause is available in the Apache
Tomcat/5.5.16 logs.

*** The tomcat logs show

# cat /usr/local/tomcat/logs/localhost.2006-03-16.log

16-Mar-2006 21:27:00 org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Servlet.service() for servlet jsp threw exception
java.lang.NullPointerException
       at org.apache.nutch.ipc.RPC.call(RPC.java:162)
       at org.apache.nutch.searcher.DistributedSearch$Client.updateSegments(DistributedSearch.java:157)
       at org.apache.nutch.searcher.DistributedSearch$Client.<init>(DistributedSearch.java:118)
       at org.apache.nutch.searcher.DistributedSearch$Client.<init>(DistributedSearch.java:92)
       at org.apache.nutch.searcher.NutchBean.<init>(NutchBean.java:98)
       at org.apache.nutch.searcher.NutchBean.<init>(NutchBean.java:80)
       at org.apache.nutch.searcher.NutchBean.get(NutchBean.java:67)
       at org.apache.jsp.search_jsp._jspService(search_jsp.java:108)
       at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
       at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
       at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:332)
       at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
       at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
       at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
       at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
       at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
       at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
       at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
       at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
       at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
       at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
       at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
       at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
       at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
       at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
       at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
       at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
       at java.lang.Thread.run(Thread.java:595)

*** end

Is this a bug for which there is a patch, or are the directories in the wrong
places!?

Many thanks,

Monu Ogbe






