Hi, I got the following exception while running a Hadoop job:

java.io.EOFException: Unexpected end of input stream
    at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:99)
    at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:87)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:75)
    at java.io.InputStream.read(InputStream.java:85)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:205)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:169)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:114)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:456)
    at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.
As I understand it, some of my input files are corrupted (I am working with the GZ format). I worked around the issue with conf.set("mapred.max.map.failures.percent", "1"), so the job now tolerates the failing map tasks, but I still don't know which file causes the problem. Question: how can I get the name of the corrupted file? Thanks in advance, Oleg.
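One way to narrow this down outside of Hadoop: if the input files can be copied locally (e.g. with `hadoop fs -get`), `gzip -t` will report which archives are truncated or corrupt. This is only a sketch under that assumption; the directory and file names below are hypothetical, and the truncated file is fabricated here just to demonstrate the check.

```shell
# Sketch: test every .gz file for integrity with `gzip -t`,
# assuming the HDFS input files were copied to a local directory.
mkdir -p /tmp/gzcheck && cd /tmp/gzcheck

# Create one valid archive and one deliberately truncated copy
# (simulating the kind of corruption that causes EOFException).
echo "hello world" | gzip > good.gz
head -c 10 good.gz > truncated.gz

# `gzip -t` exits non-zero for a damaged archive; print its name.
for f in *.gz; do
  if ! gzip -t "$f" 2>/dev/null; then
    echo "corrupt: $f"
  fi
done
```

Running the loop over the real input directory would print the names of the damaged files, which should match the splits whose map tasks fail with the EOFException above.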