When I ran "bin/nutch generate db segments -topN 50000", I got this error message:

060118 191222 Processing segments/20060118191140/fetchlist.unsorted: Sorted 5779.678649867067 entries/second
060118 191222 Overall processing: Sorted 50000 entries in 8.651 seconds.
060118 191222 Overall processing: Sorted 1.7302E-4 entries/second
Exception in thread "main" java.io.IOException: File already exists:db/webdb/linksByMD5/data
at org.apache.nutch.fs.LocalFileSystem.create(LocalFileSystem.java:135)
at org.apache.nutch.fs.LocalFileSystem.create(LocalFileSystem.java:102)
at org.apache.nutch.fs.FileUtil.copyContents(FileUtil.java:57)
at org.apache.nutch.fs.FileUtil.copyContents(FileUtil.java:78)
at org.apache.nutch.fs.FileUtil.copyContents(FileUtil.java:78)
at org.apache.nutch.fs.LocalFileSystem.rename(LocalFileSystem.java:149)
at org.apache.nutch.db.WebDBWriter.close(WebDBWriter.java:1676)
at org.apache.nutch.tools.FetchListTool.emitFetchList(FetchListTool.java:499)
at org.apache.nutch.tools.FetchListTool.emitFetchList(FetchListTool.java:319)
at org.apache.nutch.tools.FetchListTool.main(FetchListTool.java:593)

The only thing I can think of is that db.default.fetch.interval has expired, so sites will be re-fetched. Is there any need to worry?
