I have tried three times to build an index (twice with the intranet method and once with the full-web method; the error message below comes from the full-web run). Each time, when I try to update the database, I get an error saying that a key is out of order for some sites. I have used previous versions and never encountered this kind of problem.
051012 110415 Finishing update
051012 110559 Processing pagesByURL: Sorted 2702317 instructions in 103.48 seconds.
051012 110559 Processing pagesByURL: Sorted 26114.389253962116 instructions/second
Exception in thread "main" java.io.IOException: key out of order: http://www.ino.com/ after http://wwwcgi.ci.boulder.co.us/calendar.pl
        at org.apache.nutch.io.MapFile$Writer.checkKey(MapFile.java:134)
        at org.apache.nutch.io.MapFile$Writer.append(MapFile.java:120)
        at org.apache.nutch.db.WebDBWriter$PagesByURLProcessor.mergeEdits(WebDBWriter.java:736)
        at org.apache.nutch.db.WebDBWriter$CloseProcessor.closeDown(WebDBWriter.java:557)
        at org.apache.nutch.db.WebDBWriter.close(WebDBWriter.java:1544)
        at org.apache.nutch.tools.UpdateDatabaseTool.close(UpdateDatabaseTool.java:321)
        at org.apache.nutch.tools.UpdateDatabaseTool.main(UpdateDatabaseTool.java:371)
[EMAIL PROTECTED] nutch-0.7.1]#

Any help? Thanks
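
For reference, here is a minimal sketch of what the exception message seems to be saying. It assumes the ordering check is a plain byte-wise comparison of the two URL strings; the class and method names (`KeyOrderCheck`, `compareBytes`, `isOutOfOrder`) are made up for illustration and are not Nutch code.

```java
import java.nio.charset.StandardCharsets;

public class KeyOrderCheck {

    // Unsigned byte-wise comparison of two strings (assumption: this stands in
    // for whatever comparator the sorted file writer actually uses).
    static int compareBytes(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int diff = (x[i] & 0xff) - (y[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return x.length - y.length;
    }

    // A key is "out of order" if it sorts before the key written just before it.
    static boolean isOutOfOrder(String previousKey, String nextKey) {
        return compareBytes(previousKey, nextKey) > 0;
    }

    public static void main(String[] args) {
        String previous = "http://wwwcgi.ci.boulder.co.us/calendar.pl";
        String next = "http://www.ino.com/";
        // '.' (0x2E) sorts before 'c' (0x63), so "http://www." precedes "http://wwwc".
        System.out.println(isOutOfOrder(previous, next)); // prints "true"
    }
}
```

Under that assumption, http://www.ino.com/ should have been written before http://wwwcgi.ci.boulder.co.us/calendar.pl, so the two entries reached the writer in the wrong order.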
