Hi,
I am running Nutch/Hadoop in single-node mode. Nutch failed to generate a new
segment, and in the Hadoop log I found the error message below:
2007-10-12 11:09:53,961 INFO crawl.Generator - Generator: jobtracker is 'local', generating exactly one partition.
2007-10-12 11:09:58,602 WARN fs.FileSystem - Moving bad file /nutch/youjiDB/crawldb/current/part-00000/data to /nutch/bad_files/data.-934992143
2007-10-12 11:09:58,607 WARN mapred.LocalJobRunner - job_2daorz
java.lang.NullPointerException
    at org.apache.hadoop.fs.FSDataInputStream$Buffer.seek(FSDataInputStream.java:74)
    at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:121)
    at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.readBuffer(ChecksumFileSystem.java:221)
    at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.read(ChecksumFileSystem.java:167)
    at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:41)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:57)
    at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:91)
Can anyone tell me how to recover the corrupted file?
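
In case it helps to see what I have in mind, I was thinking of salvaging whatever records are still readable from the moved file, along the lines of the rough sketch below. This assumes the crawldb part file is an ordinary SequenceFile of <Text, CrawlDatum> records; the SalvageCrawlDb class and the output path are placeholders I made up, so please correct me if this is the wrong approach:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.nutch.crawl.CrawlDatum;

// Rough salvage sketch: copy readable <url, CrawlDatum> records from the
// bad file into a new SequenceFile, stopping at the first unreadable record.
public class SalvageCrawlDb {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path in = new Path("/nutch/bad_files/data.-934992143"); // where Hadoop moved the bad file
    Path out = new Path("/nutch/salvaged/data");            // placeholder output path
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, in, conf);
    SequenceFile.Writer writer =
        SequenceFile.createWriter(fs, conf, out, Text.class, CrawlDatum.class);
    Text key = new Text();
    CrawlDatum value = new CrawlDatum();
    long copied = 0;
    try {
      while (reader.next(key, value)) { // throws once it hits the corrupt region
        writer.append(key, value);
        copied++;
      }
    } catch (Exception e) {
      System.err.println("Stopped after " + copied + " records: " + e);
    } finally {
      reader.close();
      writer.close();
    }
    System.out.println("Salvaged " + copied + " records");
  }
}

I realize the salvaged output would still have to be rebuilt into a proper crawldb (the part-00000 directory looks like a MapFile with data and index files), so mainly I want to know whether this is even the right direction, or if there is a proper tool for repairing it.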
Thanks
-Qi