Hi,


My fetch cycle (nutch fetch ./segments/20121021205343/ -threads 25) failed after
3 days with the error below. Under the segment folder
(./segments/20121021205343/) there is only the generated fetch list
(crawl_generate) and no fetched content. However, /tmp/hadoop-myuser/ holds 96G
of data. I was wondering whether there is a way to recover this data and parse
the segment.

org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/file.out
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
        at org.apache.hadoop.mapred.MapOutputFile.getOutputFileForWrite(MapOutputFile.java:69)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1640)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1323)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:437)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
2012-10-24 14:43:29,671 ERROR fetcher.Fetcher - Fetcher: java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265)
        at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1318)
        at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:1354)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:1327)


Thanks,
Mohammad
