Hi,
I am using Nutch/Hadoop in single-node mode. Nutch failed to generate a new
segment, and in the Hadoop log I find
the error message below:

2007-10-12 11:09:53,961 INFO  crawl.Generator - Generator: jobtracker is
'local', generating exactly one partition.
2007-10-12 11:09:58,602 WARN  fs.FileSystem - Moving bad file 
/nutch/youjiDB/crawldb/current/part-00000/data to 
/nutch/bad_files/data.-934992143
2007-10-12 11:09:58,607 WARN  mapred.LocalJobRunner - job_2daorz
java.lang.NullPointerException
        at org.apache.hadoop.fs.FSDataInputStream$Buffer.seek(FSDataInputStream.java:74)
        at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:121)
        at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.readBuffer(ChecksumFileSystem.java:221)
        at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.read(ChecksumFileSystem.java:167)
        at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:41)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at java.io.DataInputStream.readFully(DataInputStream.java:178)
        at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:57)
        at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:91)

Can anyone tell me how to recover the corrupted file?

Thanks
-Qi
