Hi,

I have 4 boxes (1 master, 3 slaves), about 33 GB of segment data, and 4.6M fetched URLs in my crawldb. I'm using the mapred code from trunk (revision 374061, Wed, 01 Feb 2006). I was able to generate the indexes from the crawldb and linkdb, but I recently started to see this error while running a dedup on my indexes:
....
060210 061707 reduce 9%
060210 061710 reduce 10%
060210 061713 reduce 11%
060210 061717 reduce 12%
060210 061719 reduce 11%
060210 061723 reduce 10%
060210 061725 reduce 11%
060210 061726 reduce 10%
060210 061729 reduce 11%
060210 061730 reduce 9%
060210 061732 reduce 10%
060210 061736 reduce 11%
060210 061739 reduce 12%
060210 061742 reduce 10%
060210 061743 reduce 9%
060210 061745 reduce 10%
060210 061746 reduce 100%
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:310)
        at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:329)
        at org.apache.nutch.indexer.DeleteDuplicates.main(DeleteDuplicates.java:349)

I can see a lot of these messages in the jobtracker log on the master:

...
060210 061743 Task 'task_r_4t50k4' has been lost.
060210 061743 Task 'task_r_79vn7i' has been lost.
...

On every single slave, I get this FileNotFoundException in the tasktracker log:

060210 061749 Server handler 0 on 50040 caught: java.io.FileNotFoundException: /var/epile/nutch/mapred/local/task_m_273opj/part-4.out
java.io.FileNotFoundException: /var/epile/nutch/mapred/local/task_m_273opj/part-4.out
        at org.apache.nutch.fs.LocalFileSystem.openRaw(LocalFileSystem.java:121)
        at org.apache.nutch.fs.NFSDataInputStream$Checker.<init>(NFSDataInputStream.java:45)
        at org.apache.nutch.fs.NFSDataInputStream.<init>(NFSDataInputStream.java:226)
        at org.apache.nutch.fs.NutchFileSystem.open(NutchFileSystem.java:160)
        at org.apache.nutch.mapred.MapOutputFile.write(MapOutputFile.java:93)
        at org.apache.nutch.io.ObjectWritable.writeObject(ObjectWritable.java:121)
        at org.apache.nutch.io.ObjectWritable.write(ObjectWritable.java:68)
        at org.apache.nutch.ipc.Server$Handler.run(Server.java:215)

I used to be able to complete the index deduplication successfully when my segments/crawldb were smaller, but I don't see how their size would be related to the FileNotFoundException.
I'm nowhere near running out of disk space, and my hard disks are working properly. Has anyone encountered a similar issue, or have a clue about what's happening?

Thanks,
Florent
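For what it's worth, here is the sort of quick triage I've been doing: counting the lost-task messages in the jobtracker log and checking free space under the mapred local dir. This is just a minimal sketch; the sample log content below is copied from the excerpts above so the commands are self-contained, and the real log location and the local dir path are from my setup, so adjust for yours.

```shell
# Build a small sample log from the lines quoted above (the real file lives
# wherever your jobtracker writes its logs -- that path is site-specific).
cat > /tmp/jobtracker.log.sample <<'EOF'
060210 061743 Task 'task_r_4t50k4' has been lost.
060210 061743 Task 'task_r_79vn7i' has been lost.
060210 061749 Server handler 0 on 50040 caught: java.io.FileNotFoundException
EOF

# Count how many tasks the jobtracker reported lost.
grep -c "has been lost" /tmp/jobtracker.log.sample

# Check free space under the mapred local dir (path taken from the stack
# trace above); run this on each slave:
# df -h /var/epile/nutch/mapred/local
```

In my case the lost-task count runs into the hundreds over a single dedup job, while df shows plenty of free space on every slave.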