Please do not cross-post to multiple groups.
This is very annoying for those of us subscribed to both.

Thanks,
Arent-Jan

----- Original message -----
From: Diane Palla <[EMAIL PROTECTED]>
Date: Friday, August 19, 2005 3:55 pm
Subject: Fw: Crawl produced no search results.

> My crawl apparently created no indexes, so the search produces no
> search results.
> 
> For intranets that require BASIC authentication, how do I configure it to
> crawl?  How do I tell Nutch the username and password so
> it can access my intranet site?
> 
> I am also installing Nutch on the same computer that the intranet is
> hosted on.  Alternatively, can it search filesystems and produce the
> mappings for the HTML pages?
> 
> 
> Diane Palla
> Web Services Developer
> Seton Hall University
> 973 313-6199
> [EMAIL PROTECTED]
> 
> 
> 
> 
> Piotr Kosiorowski <[EMAIL PROTECTED]>
> 08/18/2005 03:26 PM
> Please respond to: [email protected]
> To: [email protected]
> cc:
> Subject: Re: Search Java JSP error after configuration and set up.  Please help.
> 
> Please make sure you started Tomcat from the crawl.test directory (or have
> it configured in nutch-default.xml in the *.war file).
> Regards
> Piotr
> Diane Palla wrote:
> > I am trying to set up Nutch with an intranet.  I used Nutch 0.7 with Java
> > J2SE 1.4.2 and Tomcat 4.1.31.
> > 
> > I did the crawl with the command
> > 
> > bin/nutch crawl bin/urls.txt -dir crawl.test -depth 3 >& crawl.log
> > 
> > 
> > and the crawl.log gave log messages that appeared to imply that it was a
> > successful run.  (Crawl.log is copied after the Java/JSP errors below.)
> > 
> > I set JAVA_HOME and NUTCH_JAVA_HOME to the J2RE when I did the crawl,
> > but I set JAVA_HOME to the J2SE when I ran Tomcat, and I went to
> > http://localhost:8080
> > 
> > I tried to search something and
> > 
> > I got this error of the Nutch Bean.
> > 
> > Did I configure something wrong?  How can I fix this?
> > 
> > 
> > Diane Palla
> > Web Services Developer
> > Seton Hall University
> > 973 313-6199
> > [EMAIL PROTECTED]
> > 
> > 
> > 
> > org.apache.jasper.JasperException
> >         at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:207)
> >         at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:240)
> >         at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:187)
> >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:809)
> >         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:200)
> >         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:146)
> >         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:209)
> >         at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
> >         at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
> >         at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
> >         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:144)
> >         at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
> >         at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
> >         at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
> >         at org.apache.catalina.core.StandardContext.invoke(StandardContext.java:2358)
> >         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:133)
> >         at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
> >         at org.apache.catalina.valves.ErrorDispatcherValve.invoke(ErrorDispatcherValve.java:118)
> >         at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:594)
> >         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:116)
> >         at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:594)
> >         at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
> >         at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
> >         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:127)
> >         at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
> >         at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
> >         at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
> >         at org.apache.coyote.tomcat4.CoyoteAdapter.service(CoyoteAdapter.java:152)
> >         at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:799)
> >         at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.processConnection(Http11Protocol.java:705)
> >         at org.apache.tomcat.util.net.TcpWorkerThread.runIt(PoolTcpEndpoint.java:577)
> >         at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:683)
> >         at java.lang.Thread.run(Thread.java:534)
> > 
> > root cause
> > java.lang.NullPointerException
> >         at org.apache.nutch.searcher.NutchBean.init(NutchBean.java:96)
> >         at org.apache.nutch.searcher.NutchBean.<init>(NutchBean.java:82)
> >         at org.apache.nutch.searcher.NutchBean.<init>(NutchBean.java:72)
> >         at org.apache.nutch.searcher.NutchBean.get(NutchBean.java:64)
> >         at org.apache.jsp.search_jsp._jspService(search_jsp.java:108)
> >         at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:92)
> >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:809)
> >         at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:162)
> >         at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:240)
> >         at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:187)
> >         at javax.servlet.http.HttpServlet.service(HttpServlet.java:809)
> >         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:200)
> >         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:146)
> >         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:209)
> >         at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
> >         at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
> >         at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
> >         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:144)
> >         at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
> >         at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
> >         at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
> >         at org.apache.catalina.core.StandardContext.invoke(StandardContext.java:2358)
> >         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:133)
> >         at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
> >         at org.apache.catalina.valves.ErrorDispatcherValve.invoke(ErrorDispatcherValve.java:118)
> >         at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:594)
> >         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:116)
> >         at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:594)
> >         at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
> >         at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
> >         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:127)
> >         at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
> >         at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
> >         at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
> >         at org.apache.coyote.tomcat4.CoyoteAdapter.service(CoyoteAdapter.java:152)
> >         at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:799)
> >         at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.processConnection(Http11Protocol.java:705)
> >         at org.apache.tomcat.util.net.TcpWorkerThread.runIt(PoolTcpEndpoint.java:577)
> >         at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:683)
> >         at java.lang.Thread.run(Thread.java:534)
> > 
> > 
> > 
> > Crawl.log:
> > 
> > run java in /usr/java/j2re1.4.2_02
> > 050818 140148 parsing 
> > file:/gartner/httpd/html/nutch-0.7/conf/nutch-default.xml
> > 050818 140149 parsing 
> > file:/gartner/httpd/html/nutch-0.7/conf/crawl-tool.xml
> > 050818 140149 parsing 
> > file:/gartner/httpd/html/nutch-0.7/conf/nutch-site.xml
> > 050818 140149 No FS indicated, using default:local
> > 050818 140149 crawl started in: crawl.test
> > 050818 140149 rootUrlFile = bin/urls.txt
> > 050818 140149 threads = 10
> > 050818 140149 depth = 3
> > 050818 140149 Created webdb at 
> > LocalFS,/gartner/httpd/html/nutch-0.7/crawl.test/db
> > 050818 140149 Starting URL processing
> > 050818 140149 Plugins: looking in: /gartner/httpd/html/nutch-0.7/plugins
> > 050818 140149 not including: 
> > /gartner/httpd/html/nutch-0.7/plugins/clustering-carrot2
> > 050818 140149 not including: 
> > /gartner/httpd/html/nutch-0.7/plugins/creativecommons
> > 050818 140149 parsing: 
> > /gartner/httpd/html/nutch-0.7/plugins/index-basic/plugin.xml
> > 050818 140150 impl: point=org.apache.nutch.indexer.IndexingFilter 
> > class=org.apache.nutch.indexer.basic.BasicIndexingFilter
> > 050818 140150 not including: 
> > /gartner/httpd/html/nutch-0.7/plugins/index-more
> > 050818 140150 not including: 
> > /gartner/httpd/html/nutch-0.7/plugins/language-identifier
> > 050818 140150 not including: 
> > /gartner/httpd/html/nutch-0.7/plugins/ontology
> > 050818 140150 not including: 
> > /gartner/httpd/html/nutch-0.7/plugins/parse-ext
> > 050818 140150 parsing: 
> > /gartner/httpd/html/nutch-0.7/plugins/parse-html/plugin.xml
> > 050818 140150 impl: point=org.apache.nutch.parse.Parser 
> > class=org.apache.nutch.parse.html.HtmlParser
> > 050818 140150 parsing: 
> > /gartner/httpd/html/nutch-0.7/plugins/parse-js/plugin.xml
> > 050818 140150 impl: point=org.apache.nutch.parse.Parser 
> > class=org.apache.nutch.parse.js.JSParseFilter
> > 050818 140150 impl: point=org.apache.nutch.parse.HtmlParseFilter 
> > class=org.apache.nutch.parse.js.JSParseFilter
> > 050818 140150 not including: 
> > /gartner/httpd/html/nutch-0.7/plugins/parse-msword
> > 050818 140150 not including: 
> > /gartner/httpd/html/nutch-0.7/plugins/parse-pdf
> > 050818 140150 not including: 
> > /gartner/httpd/html/nutch-0.7/plugins/parse-rss
> > 050818 140150 parsing: 
> > /gartner/httpd/html/nutch-0.7/plugins/parse-text/plugin.xml
> > 050818 140150 impl: point=org.apache.nutch.parse.Parser 
> > class=org.apache.nutch.parse.text.TextParser
> > 050818 140150 not including: 
> > /gartner/httpd/html/nutch-0.7/plugins/protocol-file
> > 050818 140150 not including: 
> > /gartner/httpd/html/nutch-0.7/plugins/protocol-ftp
> > 050818 140150 not including: 
> > /gartner/httpd/html/nutch-0.7/plugins/protocol-http
> > 050818 140150 parsing: 
> > /gartner/httpd/html/nutch-0.7/plugins/protocol-httpclient/plugin.xml
> > 050818 140150 impl: point=org.apache.nutch.protocol.Protocol 
> > class=org.apache.nutch.protocol.httpclient.Http
> > 050818 140150 impl: point=org.apache.nutch.protocol.Protocol 
> > class=org.apache.nutch.protocol.httpclient.Http
> > 050818 140150 parsing: 
> > /gartner/httpd/html/nutch-0.7/plugins/query-basic/plugin.xml
> > 050818 140150 impl: point=org.apache.nutch.searcher.QueryFilter 
> > class=org.apache.nutch.searcher.basic.BasicQueryFilter
> > 050818 140150 not including: 
> > /gartner/httpd/html/nutch-0.7/plugins/query-more
> > 050818 140150 parsing: 
> > /gartner/httpd/html/nutch-0.7/plugins/query-site/plugin.xml
> > 050818 140150 impl: point=org.apache.nutch.searcher.QueryFilter 
> > class=org.apache.nutch.searcher.site.SiteQueryFilter
> > 050818 140150 parsing: 
> > /gartner/httpd/html/nutch-0.7/plugins/query-url/plugin.xml
> > 050818 140150 impl: point=org.apache.nutch.searcher.QueryFilter 
> > class=org.apache.nutch.searcher.url.URLQueryFilter
> > 050818 140150 not including: 
> > /gartner/httpd/html/nutch-0.7/plugins/urlfilter-prefix
> > 050818 140150 parsing: 
> > /gartner/httpd/html/nutch-0.7/plugins/urlfilter-regex/plugin.xml
> > 050818 140150 impl: point=org.apache.nutch.net.URLFilter 
> > class=org.apache.nutch.net.RegexURLFilter
> > 050818 140150 found resource crawl-urlfilter.txt at 
> > file:/gartner/httpd/html/nutch-0.7/conf/crawl-urlfilter.txt
> > 050818 140150 Using URL normalizer: 
> > org.apache.nutch.net.BasicUrlNormalizer
> > 050818 140150 Added 1 pages
> > 050818 140150 Processing pagesByURL: Sorted 1 instructions in 0.014 seconds.
> > 050818 140150 Processing pagesByURL: Sorted 71.42857142857143 instructions/second
> > 050818 140150 Processing pagesByURL: Merged to new DB containing 1 records in 0.0070 seconds
> > 050818 140150 Processing pagesByURL: Merged 142.85714285714286 records/second
> > 050818 140150 Processing pagesByMD5: Sorted 1 instructions in 0.0020 seconds.
> > 050818 140150 Processing pagesByMD5: Sorted 500.0 instructions/second
> > 050818 140150 Processing pagesByMD5: Merged to new DB containing 1 records in 0.0030 seconds
> > 050818 140150 Processing pagesByMD5: Merged 333.3333333333333 records/second
> > 050818 140150 Processing linksByMD5: Copied file (4096 bytes) in 0.01 secs.
> > 050818 140150 Processing linksByURL: Copied file (4096 bytes) in -0.0020 secs.
> > 050818 140150 FetchListTool started
> > 050818 140151 Processing pagesByURL: Sorted 1 instructions in 0.106 seconds.
> > 050818 140151 Processing pagesByURL: Sorted 9.433962264150944 instructions/second
> > 050818 140151 Processing pagesByURL: Merged to new DB containing 1 records in 0.0 seconds
> > 050818 140151 Processing pagesByURL: Merged Infinity records/second
> > 050818 140151 Processing pagesByMD5: Sorted 1 instructions in 0.0020 seconds.
> > 050818 140151 Processing pagesByMD5: Sorted 500.0 instructions/second
> > 050818 140151 Processing pagesByMD5: Merged to new DB containing 1 records in 0.0020 seconds
> > 050818 140151 Processing pagesByMD5: Merged 500.0 records/second
> > 050818 140151 Processing linksByMD5: Copied file (4096 bytes) in 0.0010 secs.
> > 050818 140151 Processing linksByURL: Copied file (4096 bytes) in 0.0020 secs.
> > 050818 140151 Processing /gartner/httpd/html/nutch-0.7/crawl.test/segments/20050818140150/fetchlist.unsorted: Sorted 1 entries in 0.011 seconds.
> > 050818 140151 Processing /gartner/httpd/html/nutch-0.7/crawl.test/segments/20050818140150/fetchlist.unsorted: Sorted 90.90909090909092 entries/second
> > 050818 140151 Overall processing: Sorted 1 entries in 0.011 seconds.
> > 050818 140151 Overall processing: Sorted 0.011 entries/second
> > 050818 140151 FetchListTool completed
> > 050818 140151 logging at INFO
> > 050818 140151 fetching http://gartner.shu.edu/
> > 050818 140151 http.proxy.host = null
> > 050818 140151 http.proxy.port = 8080
> > 050818 140151 http.timeout = 10000
> > 050818 140151 http.content.limit = 65536
> > 050818 140151 http.agent = NutchCVS/0.7 (Nutch; http://lucene.apache.org/nutch/bot.html; nutch-[EMAIL PROTECTED])
> > 050818 140151 http.auth.ntlm.username = 
> > 050818 140151 fetcher.server.delay = 1000
> > 050818 140151 http.max.delays = 100
> > 050818 140152 Configured Client
> > 050818 140152 basic authentication scheme selected
> > 050818 140152 basic authentication scheme selected
> > 050818 140153 Updating /gartner/httpd/html/nutch-0.7/crawl.test/db
> > 050818 140154 Updating for 
> > /gartner/httpd/html/nutch-0.7/crawl.test/segments/20050818140150
> > 050818 140154 Processing document 0
> > 050818 140154 Finishing update
> > 050818 140154 Processing pagesByURL: Sorted 1 instructions in 0.0060 seconds.
> > 050818 140154 Processing pagesByURL: Sorted 166.66666666666666 instructions/second
> > 050818 140154 Processing pagesByURL: Merged to new DB containing 1 records in 0.0010 seconds
> > 050818 140154 Processing pagesByURL: Merged 1000.0 records/second
> > 050818 140154 Processing pagesByMD5: Sorted 1 instructions in 0.0050 seconds.
> > 050818 140154 Processing pagesByMD5: Sorted 200.0 instructions/second
> > 050818 140154 Processing pagesByMD5: Merged to new DB containing 1 records in 0.0 seconds
> > 050818 140154 Processing pagesByMD5: Merged Infinity records/second
> > 050818 140154 Processing linksByMD5: Copied file (4096 bytes) in 0.0020 secs.
> > 050818 140154 Processing linksByURL: Copied file (4096 bytes) in 0.0040 secs.
> > 050818 140154 Update finished
> > 050818 140154 FetchListTool started
> > 050818 140154 Overall processing: Sorted 0 entries in 0.0 seconds.
> > 050818 140154 Overall processing: Sorted NaN entries/second
> > 050818 140154 FetchListTool completed
> > 050818 140154 logging at INFO
> > 050818 140155 Updating /gartner/httpd/html/nutch-0.7/crawl.test/db
> > 050818 140155 Updating for 
> > /gartner/httpd/html/nutch-0.7/crawl.test/segments/20050818140154
> > 050818 140155 Finishing update
> > 050818 140155 Update finished
> > 050818 140155 FetchListTool started
> > 050818 140156 Overall processing: Sorted 0 entries in 0.0 seconds.
> > 050818 140156 Overall processing: Sorted NaN entries/second
> > 050818 140156 FetchListTool completed
> > 050818 140156 logging at INFO
> > 050818 140157 Updating /gartner/httpd/html/nutch-0.7/crawl.test/db
> > 050818 140157 Updating for 
> > /gartner/httpd/html/nutch-0.7/crawl.test/segments/20050818140156
> > 050818 140157 Finishing update
> > 050818 140157 Update finished
> > 050818 140157 Updating /gartner/httpd/html/nutch-0.7/crawl.test/segments
> > from /gartner/httpd/html/nutch-0.7/crawl.test/db
> > 050818 140157  reading 
> > /gartner/httpd/html/nutch-0.7/crawl.test/segments/20050818140150
> > 050818 140157  reading 
> > /gartner/httpd/html/nutch-0.7/crawl.test/segments/20050818140154
> > 050818 140157  reading 
> > /gartner/httpd/html/nutch-0.7/crawl.test/segments/20050818140156
> > 050818 140157 Sorting pages by url...
> > 050818 140157 Getting updated scores and anchors from db...
> > 050818 140157 Sorting updates by segment...
> > 050818 140157 Updating segments...
> > 050818 140157  updating 
> > /gartner/httpd/html/nutch-0.7/crawl.test/segments/20050818140150
> > 050818 140157 Done updating 
> > /gartner/httpd/html/nutch-0.7/crawl.test/segments from 
> > /gartner/httpd/html/nutch-0.7/crawl.test/db
> > 050818 140158 indexing segment: 
> > /gartner/httpd/html/nutch-0.7/crawl.test/segments/20050818140150
> > 050818 140158 * Opening segment 20050818140150
> > 050818 140158 * Indexing segment 20050818140150
> > 050818 140158 * Optimizing index...
> > 050818 140158 * Moving index to NFS if needed...
> > 050818 140158 DONE indexing segment 20050818140150: total 1 records in 0.034 s (Infinity rec/s).
> > 050818 140158 done indexing
> > 050818 140158 indexing segment: 
> > /gartner/httpd/html/nutch-0.7/crawl.test/segments/20050818140154
> > 050818 140158 * Opening segment 20050818140154
> > 050818 140158 * Indexing segment 20050818140154
> > 050818 140158 * Optimizing index...
> > 050818 140158 * Moving index to NFS if needed...
> > 050818 140158 DONE indexing segment 20050818140154: total 0 records in 0.046 s (NaN rec/s).
> > 050818 140158 done indexing
> > 050818 140158 indexing segment: 
> > /gartner/httpd/html/nutch-0.7/crawl.test/segments/20050818140156
> > 050818 140158 * Opening segment 20050818140156
> > 050818 140158 * Indexing segment 20050818140156
> > 050818 140158 * Optimizing index...
> > 050818 140158 * Moving index to NFS if needed...
> > 050818 140158 DONE indexing segment 20050818140156: total 0 records in 0.071 s (NaN rec/s).
> > 050818 140158 done indexing
> > 050818 140158 Reading url hashes...
> > 050818 140158 Sorting url hashes...
> > 050818 140158 Deleting url duplicates...
> > 050818 140158 Deleted 0 url duplicates.
> > 050818 140158 Reading content hashes...
> > 050818 140158 Sorting content hashes...
> > 050818 140158 Deleting content duplicates...
> > 050818 140158 Deleted 0 content duplicates.
> > 050818 140158 Duplicate deletion complete locally.  Now returning to NFS...
> > 050818 140158 DeleteDuplicates complete
> > 050818 140158 Merging segment indexes... 
> > 050818 140158 crawl finished: crawl.test
> 
> 
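
One rough way to check how much the crawl quoted above actually indexed is to scan crawl.log for its "DONE indexing segment" lines. The following is only a sketch in Python, not part of Nutch: the function name indexed_records is made up for illustration, and the pattern assumes the log wording shown in this thread. It strips mail quote markers first so that it also works on a log pasted from a reply.

```python
import re

# Matches log lines such as:
#   050818 140158 DONE indexing segment 20050818140150: total 1 records in 0.034 s
# (wording taken from the log quoted in this thread; other versions may differ)
DONE_RE = re.compile(r"DONE indexing segment (\d+): total (\d+)\s+records")


def indexed_records(log_text):
    """Return {segment_name: record_count} for each indexed segment in the log."""
    # Drop leading mail quote markers ("> > ") from every line.
    cleaned = re.sub(r"^[> ]+", "", log_text, flags=re.MULTILINE)
    # Collapse all whitespace so entries wrapped over several lines are re-joined.
    cleaned = " ".join(cleaned.split())
    return {segment: int(count) for segment, count in DONE_RE.findall(cleaned)}
```

It could be called as indexed_records(open("crawl.log").read()). For the log in this thread it should report one record for segment 20050818140150 and zero for the other two, which suggests that only a single page made it into the index.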
