Mathijs Homminga wrote:
> Hi Andrzej,
>
> The job stopped because there was no space left on the disk:
>
> FATAL fetcher.Fetcher - org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
> FATAL fetcher.Fetcher - at org.apache.hadoop.fs.LocalFileSystem$LocalFSFileOutputStream.write(LocalFileSystem.java:150)
> FATAL fetcher.Fetcher - at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:112)
>
> We use a local FS. Temporary data is stored in /tmp/hadoop/mapred/

Ok, in your case this partial data may be recoverable, but with some manual work involved ...

At this stage, I'm assuming that even if you started the reduce phase, its output won't be usable at all. So, we need to start from the data contained in the partial map outputs. Map outputs are a set of SequenceFiles containing pairs of <Text, FetcherOutput> data. Umm, forgot to ask you - are you running trunk/ or Nutch 0.8? If trunk, then use the Text class; if 0.8, replace all occurrences of Text with UTF8.

This is such a common problem that I created a special tool to address it - please see http://issues.apache.org/jira/browse/NUTCH-451 .

Let me repeat what the javadoc says, so that there's no misunderstanding: if you use DFS and your fetch job is aborted, there is no way in the world to recover the data - it's permanently lost. If you run with a local FS, you can try this tool and hope for the best.

-- 
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
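
To illustrate the general idea of salvaging complete records from an output file that was cut off mid-write, here is a minimal sketch in plain Java. It is only an assumption-laden illustration: the record layout (a key and a value written with `writeUTF`) is a hypothetical stand-in, not the real SequenceFile format, and the class and method names (`PartialRecordSalvage`, `encode`, `salvage`) are invented for this example. It is not the implementation of the NUTCH-451 tool.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartialRecordSalvage {

    /** Writes key/value pairs in a toy format: writeUTF(key), writeUTF(value). */
    static byte[] encode(String[][] pairs) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (String[] pair : pairs) {
                out.writeUTF(pair[0]);
                out.writeUTF(pair[1]);
            }
        }
        return bytes.toByteArray();
    }

    /** Reads pairs until the data ends; a pair cut off mid-write is dropped. */
    static List<String[]> salvage(byte[] data) throws IOException {
        List<String[]> complete = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (true) {
                String key = in.readUTF();
                String value = in.readUTF();
                // Only pairs whose key AND value were fully read are kept.
                complete.add(new String[] {key, value});
            }
        } catch (EOFException endOfData) {
            // Expected: the data ended, possibly in the middle of a record.
        }
        return complete;
    }

    public static void main(String[] args) throws IOException {
        byte[] full = encode(new String[][] {
            {"http://example.com/a", "content-a"},
            {"http://example.com/b", "content-b"},
            {"http://example.com/c", "content-c"},
        });
        // Simulate a write aborted by a full disk: drop the last 4 bytes.
        byte[] truncated = Arrays.copyOf(full, full.length - 4);
        for (String[] pair : salvage(truncated)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

In this sketch the third record is incomplete, so only the first two pairs are printed. Real SequenceFiles carry a header and sync markers, so an actual recovery would have to go through Hadoop's own reader classes rather than a hand-rolled loop like this one.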
