You should email the [EMAIL PROTECTED] list, where Nutch users "hang out".
Otis

--- Christian Aschoff <[EMAIL PROTECTED]> wrote:

> Hi,
>
> after three days of crawling the intranet, the Nutch crawler threw
> an exception :-(
>
> It seems that the crawler wants to do something with the .DS_Store
> file from Mac OS X and does not know how to handle it?
>
> Can I re-initiate the clean-up without crawling the intranet again?
>
> Regards,
> Christian
>
> 050906 184831 Processing document 29000
> 050906 184833 Processing document 30000
> 050906 184835 Finishing update
> 050906 184845 Processing pagesByURL: Sorted 288601 instructions in 9.954 seconds.
> 050906 184845 Processing pagesByURL: Sorted 28993.46996182439 instructions/second
> 050906 184856 Processing pagesByURL: Merged to new DB containing 181590 records in 9.095 seconds
> 050906 184856 Processing pagesByURL: Merged 19965.915338097853 records/second
> 050906 184857 Processing pagesByMD5: Sorted 76461 instructions in 1.199 seconds.
> 050906 184857 Processing pagesByMD5: Sorted 63770.64220183486 instructions/second
> 050906 184904 Processing pagesByMD5: Merged to new DB containing 181590 records in 5.738 seconds
> 050906 184904 Processing pagesByMD5: Merged 31646.91530149878 records/second
> 050906 184911 Processing linksByMD5: Sorted 286132 instructions in 7.354 seconds.
> 050906 184911 Processing linksByMD5: Sorted 38908.34919771553 instructions/second
> 050906 184940 Processing linksByMD5: Merged to new DB containing 1060091 records in 27.791 seconds
> 050906 184940 Processing linksByMD5: Merged 38145.11892339247 records/second
> 050906 184943 Processing linksByURL: Sorted 145747 instructions in 3.082 seconds.
> 050906 184943 Processing linksByURL: Sorted 47289.74691758599 instructions/second
> 050906 185014 Processing linksByURL: Merged to new DB containing 1060091 records in 29.113 seconds
> 050906 185014 Processing linksByURL: Merged 36412.977020575 records/second
> 050906 185017 Processing linksByMD5: Sorted 181123 instructions in 2.968 seconds.
> 050906 185017 Processing linksByMD5: Sorted 61025.26954177897 instructions/second
> 050906 185045 Processing linksByMD5: Merged to new DB containing 1060091 records in 26.092 seconds
> 050906 185045 Processing linksByMD5: Merged 40628.96673309827 records/second
> 050906 185234 Update finished
> 050906 185235 Updating /Users/caschoff/Desktop/nutch-0.7/crawl.uni.test/segments from /Users/caschoff/Desktop/nutch-0.7/crawl.uni.test/db
> 050906 185235 reading /Users/caschoff/Desktop/nutch-0.7/crawl.uni.test/segments/.DS_Store
> Exception in thread "main" java.io.FileNotFoundException: /Users/caschoff/Desktop/nutch-0.7/crawl.uni.test/segments/.DS_Store/fetcher/data
>         at org.apache.nutch.fs.LocalFileSystem.open(LocalFileSystem.java:93)
>         at org.apache.nutch.io.SequenceFile$Reader.<init>(SequenceFile.java:194)
>         at org.apache.nutch.io.SequenceFile$Reader.<init>(SequenceFile.java:187)
>         at org.apache.nutch.io.MapFile$Reader.<init>(MapFile.java:190)
>         at org.apache.nutch.io.MapFile$Reader.<init>(MapFile.java:179)
>         at org.apache.nutch.io.ArrayFile$Reader.<init>(ArrayFile.java:50)
>         at org.apache.nutch.tools.UpdateSegmentsFromDb.addSegment(UpdateSegmentsFromDb.java:197)
>         at org.apache.nutch.tools.UpdateSegmentsFromDb.run(UpdateSegmentsFromDb.java:182)
>         at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:147)
>
> [2]-  Exit 1    bin/nutch crawl urls -dir crawl.uni.test -depth 10 1>&crawl.log
>
> ---
> Dipl. Ing. (FH) Christian Aschoff
>
> Office:
> Universität Ulm/KIZ
> Room O26/5403
>
> Tel. 0731 50-22432
> [EMAIL PROTECTED]
>
> Private:
> Fabristr. 13
> 89075 Ulm
> Germany/Old Europe
>
> Tel. 0731 60280360
> Fax. 0731 60280361
> [EMAIL PROTECTED]
>
> Help out: www.meyers-konversationslexikon.de
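
The trace in the quoted log suggests that UpdateSegmentsFromDb treated the Finder's .DS_Store file as if it were a segment directory and then failed to open .DS_Store/fetcher/data. A rough sketch of one way to clear such files out of a segments directory is shown here. The function name remove_ds_store is hypothetical and not part of Nutch, and the thread does not say whether the update step can be re-run afterwards without recrawling, so this only addresses the stray files themselves.

```python
from pathlib import Path


def remove_ds_store(segments_dir):
    """Delete macOS Finder metadata files (.DS_Store) under segments_dir.

    Hypothetical helper, not part of Nutch. Returns the removed paths, sorted.
    """
    removed = []
    # Collect matches first so the tree is not modified while it is walked.
    for path in list(Path(segments_dir).rglob(".DS_Store")):
        if path.is_file():
            path.unlink()
            removed.append(str(path))
    return sorted(removed)


# Example call (path taken from the log above, adjust to your own crawl dir):
# remove_ds_store("/Users/caschoff/Desktop/nutch-0.7/crawl.uni.test/segments")
```

Only files named exactly .DS_Store are touched, so real segment data such as fetcher/data is left alone. Opening the crawl directory in the Finder again may recreate the file.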