I am running a job on a Hadoop 2.4.0 cluster, and one of its map tasks is 
logging the following warnings (IPs changed, but otherwise verbatim):

---
2015-03-16 06:59:37,994 WARN [ResponseProcessor for block BP-437460642-10.0.0.1-1391018641114:blk_1084609656_11045296] org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception  for block BP-437460642-10.0.0.1-1391018641114:blk_1084609656_11045296
java.io.EOFException: Premature EOF: no length prefix available
    at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1990)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:176)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:796)
2015-03-16 06:59:37,994 WARN [ResponseProcessor for block BP-437460642-10.0.0.1-1391018641114:blk_1084609655_11045295] org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception  for block BP-437460642-10.0.0.1-1391018641114:blk_1084609655_11045295
java.io.IOException: Bad response ERROR for block BP-437460642-10.0.0.1-1391018641114:blk_1084609655_11045295 from datanode 10.0.0.1:1019
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:819)
---
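
To help correlate these warnings with the datanode logs, here is a minimal 
sketch that pulls the distinct block IDs and datanode addresses out of a task 
log excerpt like the one above. The function name summarize and both regular 
expressions are my own assumptions based only on the lines shown here; they are 
not part of Hadoop, and other log formats may need different patterns.

```python
import re

# Assumed pattern for HDFS block identifiers as they appear in the excerpt,
# e.g. BP-437460642-10.0.0.1-1391018641114:blk_1084609656_11045296
BLOCK_RE = re.compile(r"(BP-\d+-[\d.]+-\d+):(blk_\d+)_(\d+)")

# Assumed pattern for the "from datanode host:port" part of a bad-response
# warning.
DATANODE_RE = re.compile(r"from datanode\s+(\S+:\d+)")


def summarize(log_text):
    """Return (sorted unique block IDs, sorted unique datanode addresses)."""
    # Mail clients wrap long log lines, so collapse all whitespace runs
    # into single spaces before matching.
    flat = " ".join(log_text.split())
    blocks = sorted({m.group(2) for m in BLOCK_RE.finditer(flat)})
    datanodes = sorted(set(DATANODE_RE.findall(flat)))
    return blocks, datanodes
```

Calling summarize(open(path).read()) on a saved copy of the task log (path 
being wherever the log was saved) gives the block IDs to search for in the 
logs of the datanodes it reports.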

The job is launched from Hive 0.13.0, and the warnings consistently occur on 
the same split, which belongs to a sequence file.  After logging a few errors 
like the above, the map task seems to make no progress and eventually times out 
(mapreduce.task.timeout is set to more than 5 hours).

Any pointers on how to begin troubleshooting and resolving this issue?  While 
searching around, I found suggestions that this is indicative of a "network 
issue", but since it happens consistently on the same split, that explanation 
seems unlikely.
