Hi, I got the following exception running a Hadoop job:

java.io.EOFException: Unexpected end of input stream
    at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:99)
    at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:87)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:75)
    at java.io.InputStream.read(InputStream.java:85)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:205)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:169)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:114)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:456)
    at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.

As I understand it, some of my files are corrupted (I am working with the GZ
format).

I worked around the issue with conf.set("mapred.max.map.failures.percent", "1"), which allows up to 1% of map tasks to fail without failing the job,
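
For reference, a minimal sketch of where that setting goes in the driver (assuming the classic `mapred` API that appears in the stack trace; the class and method names here are hypothetical):

```java
import org.apache.hadoop.mapred.JobConf;

public class TolerantJobSetup {
    public static JobConf buildConf() {
        JobConf conf = new JobConf();
        // Allow up to 1% of map tasks to fail without failing the
        // whole job, so one corrupt .gz split does not kill the run.
        conf.set("mapred.max.map.failures.percent", "1");
        return conf;
    }
}
```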

but I don't know which file causes the problem.

Question:
 How can I find out which file is corrupted?
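
One way to narrow this down outside Hadoop (a sketch, assuming the inputs have been copied out of HDFS, e.g. with hadoop fs -get, into a local ./input directory; the path is hypothetical):

```shell
# Test every gzip file; `gzip -t` exits non-zero on a truncated
# or corrupt stream, which is what the EOFException points at.
for f in ./input/*.gz; do
  gzip -t "$f" 2>/dev/null || echo "corrupt: $f"
done
```

Alternatively, inside the job itself, the Mapper's setup() can cast context.getInputSplit() to FileSplit and log getPath(), so the task log of the failed attempt names the file it was reading.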

Thanks in advance
Oleg.
