I am wondering if Nutch is really usable in the real world. As I mentioned in an earlier e-mail today, I had problems with something like "hot spot in virtual machine". Now I am having another kind of problem, the same one I reported about a month ago without getting any reply. I am using Nutch 0.7.1 on FC-3 with 1 GB of RAM, a 4 Mbit/s connection, and 50 threads, and I got the following error message:
051201 201356 Processing pagesByURL: Sorted 44235.525534441804 instructions/second
Exception in thread "main" java.io.IOException: key out of order: http://neic.usgs.gov/neis/states/states.html after http:o/neic.usgs.gov/neis/states/state_largest.html
        at org.apache.nutch.io.MapFile$Writer.checkKey(MapFile.java:134)
        at org.apache.nutch.io.MapFile$Writer.append(MapFile.java:120)
        at org.apache.nutch.db.WebDBWriter$PagesByURLProcessor.mergeEdits(WebDBWriter.java:736)
        at org.apache.nutch.db.WebDBWriter$CloseProcessor.closeDown(WebDBWriter.java:557)
        at org.apache.nutch.db.WebDBWriter.close(WebDBWriter.java:1544)
        at org.apache.nutch.tools.UpdateDatabaseTool.close(UpdateDatabaseTool.java:321)
        at org.apache.nutch.tools.UpdateDatabaseTool.main(UpdateDatabaseTool.java:371)
        at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:141)
[EMAIL PROTECTED] nutch-0.7.1]#

I have been using Nutch for more than a year now, but recently I am happy if I manage to finish the task once in every 5 tries. I really don't know what is going on, as I am just a user and not a programmer. What I do know is that I am getting far more headaches than satisfaction from trying to set up anything with Nutch.

Thanks,
Wmelo

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
