I am wondering whether Nutch is really usable in the real world. As I mentioned in an earlier e-mail today, I had problems with something like "hot spot in virtual machine". Now I am having another kind of problem, the same one that I reported about a month ago without getting any reply. I am using Nutch 0.7.1 on FC-3 with 1 GB of RAM, a 4 Mbit/s connection, and 50 threads, and I got the following error message:

051201 201356 Processing pagesByURL: Sorted 44235.525534441804 instructions/second
Exception in thread "main" java.io.IOException: key out of order: http://neic.usgs.gov/neis/states/states.html after http:o/neic.usgs.gov/neis/states/state_largest.html
        at org.apache.nutch.io.MapFile$Writer.checkKey(MapFile.java:134)
        at org.apache.nutch.io.MapFile$Writer.append(MapFile.java:120)
        at org.apache.nutch.db.WebDBWriter$PagesByURLProcessor.mergeEdits(WebDBWriter.java:736)
        at org.apache.nutch.db.WebDBWriter$CloseProcessor.closeDown(WebDBWriter.java:557)
        at org.apache.nutch.db.WebDBWriter.close(WebDBWriter.java:1544)
        at org.apache.nutch.tools.UpdateDatabaseTool.close(UpdateDatabaseTool.java:321)
        at org.apache.nutch.tools.UpdateDatabaseTool.main(UpdateDatabaseTool.java:371)
        at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:141)
[EMAIL PROTECTED] nutch-0.7.1]#

I have been using Nutch for more than a year now, but recently I am happy if even one out of every five attempts finishes successfully. I really don't know what is going on, as I am just a user and not a programmer. What I do know is that I am getting many more headaches than satisfaction from trying to set up anything using Nutch.

Thanks,
Wmelo
