Dear Users!
I have a problem with Nutch 0.6 on SUSE with Java 1.5.
I am trying to update the db from a new segment, and at the end of the run Nutch fails with the following error:
.......
050426 154309 Processing document 495000
050426 154311 Processing document 496000
050426 154312 Processing document 497000
050426 154314 Processing document 498000
050426 154315 Processing document 499000
050426 154317 Finishing update
050426 162604 Processing pagesByURL: Sorted 77421532 instructions in 2566.565 seconds.
050426 162604 Processing pagesByURL: Sorted 30165.42811111349 instructions/second
050426 164313 Processing pagesByURL: Merged to new DB containing 4523724 records in 503.623 seconds
050426 164313 Processing pagesByURL: Merged 8982.36180635118 records/second
050426 164406 Processing pagesByMD5: Sorted 4560431 instructions in 44.974 seconds.
050426 164406 Processing pagesByMD5: Sorted 101401.49864366079 instructions/second
050426 164504 Processing pagesByMD5: Merged to new DB containing 4523724 records in 50.646 seconds
050426 164504 Processing pagesByMD5: Merged 89320.45966117758 records/second
050426 165039 Processing linksByMD5: Sorted 21599781 instructions in 334.802 seconds.
050426 165039 Processing linksByMD5: Sorted 64515.089515594285 instructions/second
050426 165432 Processing linksByMD5: Merged to new DB containing 10793993 records in 161.061 seconds
050426 165432 Processing linksByMD5: Merged 67018.04285332887 records/second
Exception in thread "main" java.io.IOException: zero length keys not allowed
    at net.nutch.io.SequenceFile$Writer.append(SequenceFile.java:106)
    at net.nutch.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:641)
    at net.nutch.io.SequenceFile$Sorter$MergePass.run(SequenceFile.java:542)
    at net.nutch.io.SequenceFile$Sorter.mergePass(SequenceFile.java:489)
    at net.nutch.io.SequenceFile$Sorter.sort(SequenceFile.java:322)
    at net.nutch.db.WebDBWriter$CloseProcessor.closeDown(WebDBWriter.java:522)
    at net.nutch.db.WebDBWriter$LinksByURLProcessor.closeDown(WebDBWriter.java:1220)
    at net.nutch.db.WebDBWriter.close(WebDBWriter.java:1557)
    at net.nutch.tools.UpdateDatabaseTool.close(UpdateDatabaseTool.java:301)
    at net.nutch.tools.UpdateDatabaseTool.main(UpdateDatabaseTool.java:351)
How can I fix this? And how can I tell which segment is bad?
Thanks, Ferenc
