Exception from crawl command

throwawayuseridfor-nutch Wed, 01 Mar 2006 08:56:29 -0800

Hi,

I've been experimenting with Nutch and Lucene.
Everything was working fine, but now the crawl
command is throwing an exception.


The command manages a few fetch cycles but then I get
the following message:

060301 161128 status: segment 20060301161046, 38 pages, 0 errors, 856591 bytes, 41199 ms
060301 161128 status: 0.92235243 pages/s, 162.43396 kb/s, 22541.87 bytes/page
060301 161129 Updating C:\PF\nutch-0.7.1\LIVE\db
060301 161129 Updating for C:\PF\nutch-0.7.1\LIVE\segments\20060301161046
060301 161129 Processing document 0
060301 161130 Finishing update
060301 161130 Processing pagesByURL: Sorted 952 instructions in 0.02 seconds.
060301 161130 Processing pagesByURL: Sorted 47600.0 instructions/second
java.io.IOException: already exists: C:\PF\nutch-0.7.1\LIVE\db\webdb.new\pagesByURL
        at org.apache.nutch.io.MapFile$Writer.<init>(MapFile.java:86)
        at org.apache.nutch.db.WebDBWriter$CloseProcessor.closeDown(WebDBWriter.java:549)
        at org.apache.nutch.db.WebDBWriter.close(WebDBWriter.java:1544)
        at org.apache.nutch.tools.UpdateDatabaseTool.close(UpdateDatabaseTool.java:321)
        at org.apache.nutch.tools.UpdateDatabaseTool.main(UpdateDatabaseTool.java:371)
        at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:141)
Exception in thread "main" 

Does anyone have any idea what the problem is likely
to be?  I am running Nutch 0.7.1.

thanks,


Julian.
