When I ran "bin/nutch generate db segments -topN 50000", I got this error message:

060118 191222 Processing segments/20060118191140/fetchlist.unsorted: Sorted 5779.678649867067 entries/second
060118 191222 Overall processing: Sorted 50000 entries in 8.651 seconds.
060118 191222 Overall processing: Sorted 1.7302E-4 entries/second
Exception in thread "main" java.io.IOException: File already exists:db/webdb/linksByMD5/data
at org.apache.nutch.fs.LocalFileSystem.create(LocalFileSystem.java:135)
at org.apache.nutch.fs.LocalFileSystem.create(LocalFileSystem.java:102)
at org.apache.nutch.fs.FileUtil.copyContents(FileUtil.java:57)
at org.apache.nutch.fs.FileUtil.copyContents(FileUtil.java:78)
at org.apache.nutch.fs.FileUtil.copyContents(FileUtil.java:78)
at org.apache.nutch.fs.LocalFileSystem.rename(LocalFileSystem.java:149)
at org.apache.nutch.db.WebDBWriter.close(WebDBWriter.java:1676)
at org.apache.nutch.tools.FetchListTool.emitFetchList(FetchListTool.java:499)
at org.apache.nutch.tools.FetchListTool.emitFetchList(FetchListTool.java:319)
at org.apache.nutch.tools.FetchListTool.main(FetchListTool.java:593)

The only thing I can think of is that db.default.fetch.interval has expired, so sites will be re-fetched. Is there any need to worry?
