Have you tried the hdfs fsck command to check for inconsistencies with that block?

On 16 Mar 2015 19:39, "Shipper, Jay [USA]" <shipper_...@bah.com> wrote:
> On a Hadoop 2.4.0 cluster, I have a job running that's encountering the
> following warnings in one of its map tasks (IPs changed, but otherwise,
> this is verbatim):
>
> ---
> 2015-03-16 06:59:37,994 WARN [ResponseProcessor for block BP-437460642-10.0.0.1-1391018641114:blk_1084609656_11045296] org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-437460642-10.0.0.1-1391018641114:blk_1084609656_11045296
> java.io.EOFException: Premature EOF: no length prefix available
>     at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1990)
>     at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:176)
>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:796)
> 2015-03-16 06:59:37,994 WARN [ResponseProcessor for block BP-437460642-10.0.0.1-1391018641114:blk_1084609655_11045295] org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-437460642-10.0.0.1-1391018641114:blk_1084609655_11045295
> java.io.IOException: Bad response ERROR for block BP-437460642-10.0.0.1-1391018641114:blk_1084609655_11045295 from datanode 10.0.0.1:1019
>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:819)
> ---
>
> This job is launched from Hive 0.13.0, and it's consistently happening
> on the same split, which is on a sequence file. After logging a few errors
> like the above, the map task seems to make no progress and eventually times
> out (with a mapreduce.task.timeout value greater than 5 hours).
>
> Any pointers on how to begin troubleshooting and resolving this issue?
> In searching around, it was suggested that this is indicative of a "network
> issue", but as it happens on the same split consistently, that explanation
> seems unlikely.
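
In case it helps, here is a rough sketch of how the fsck check could be scripted. It is only an illustration under a few assumptions: the helper names (block_names, fsck_command, lines_mentioning, run_fsck) are made up for this example, and the path you pass in has to be whatever HDFS directory the failing task writes to, which the log above does not tell us. The flags -files, -blocks, -locations and -openforwrite are standard fsck options; -openforwrite is included because the warnings come from the write path (DFSOutputStream), so the affected file may still be open.

```python
import re
import subprocess

# Matches block names such as blk_1084609656; the trailing generation
# stamp (_11045296) is deliberately left out of the match.
BLOCK_RE = re.compile(r"\bblk_\d+")


def block_names(log_text):
    """Return distinct block names found in a DFSClient log, in first-seen order."""
    seen = []
    for name in BLOCK_RE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def fsck_command(path):
    """Build the fsck argument list for `path` (a placeholder you must supply)."""
    return ["hdfs", "fsck", path, "-files", "-blocks", "-locations", "-openforwrite"]


def lines_mentioning(fsck_output, names):
    """Keep only the fsck output lines that mention one of the given block names."""
    wanted = set(names)
    return [
        line
        for line in fsck_output.splitlines()
        if wanted.intersection(BLOCK_RE.findall(line))
    ]


def run_fsck(path, names):
    """Run fsck on `path` and return the lines about the blocks of interest.

    Assumes the `hdfs` client is on PATH and configured for the cluster.
    """
    result = subprocess.run(fsck_command(path), capture_output=True, text=True)
    return lines_mentioning(result.stdout, names)
```

You would feed the task log to block_names(), then call run_fsck() with the task's output directory and the resulting names. If fsck reports the file as healthy, the next place to look would be the log of the datanode named in the "Bad response ERROR" line for the same timestamp.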