Please check your network; this is generally caused by an unstable network device.
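If the network itself checks out, you can also make the client-side write-pipeline recovery more tolerant. On Hadoop 2.x the relevant properties go in hdfs-site.xml; the values below are illustrative starting points, not recommendations tuned for your cluster:

    <!-- hdfs-site.xml: client-side write-pipeline tuning (illustrative values) -->
    <property>
      <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
      <value>true</value>
    </property>
    <property>
      <!-- NEVER can help on very small clusters with no spare datanode to swap in -->
      <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
      <value>DEFAULT</value>
    </property>
    <property>
      <!-- raise client-side socket read timeout (ms) if the network is flaky -->
      <name>dfs.client.socket-timeout</name>
      <value>120000</value>
    </property>
    <property>
      <!-- raise datanode write timeout (ms) for slow pipeline acks -->
      <name>dfs.datanode.socket.write.timeout</name>
      <value>600000</value>
    </property>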
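Regarding your last question: if you want the task retried on another node rather than just counted, rethrow the IOException from the reducer so the framework fails the attempt and reschedules it. A minimal sketch; writeToHdfs and the counter names are hypothetical stand-ins for your own code:

    // Wrap the HDFS write so a persistent failure fails the task attempt
    // instead of being silently counted. Names here are hypothetical.
    try {
        writeToHdfs(key, value);   // your existing write call
    } catch (IOException e) {
        context.getCounter("HDFS", "WRITE_FAILURES").increment(1);
        // Rethrow so the MapReduce framework marks this attempt as failed
        // and reschedules it (up to mapreduce.reduce.maxattempts, default 4).
        throw e;
    }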
On Wed, Nov 5, 2014 at 5:55 PM, Hayden Marchant <hayd...@amobee.com> wrote:

> I have a MapReduce job running on Hadoop 2.0.0, and on some 'heavy' jobs
> I am seeing the following errors in the reducer:
>
> 2014-11-04 13:30:57,761 WARN org.apache.hadoop.hdfs.DFSClient:
> DFSOutputStream ResponseProcessor exception for block
> BP-60005389-172.30.21.49-1379424439243:blk_-4575496846575688807_62439186
> java.io.EOFException: Premature EOF: no length prefix available
>         at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:171)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:114)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:695)
> 2014-11-04 13:30:57,842 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block
> BP-60005389-172.30.21.49-1379424439243:blk_-4575496846575688807_62439186
> in pipeline 172.30.120.143:50010, 172.30.120.186:50010: bad datanode
> 172.30.120.143:50010
> 2014-11-04 13:33:09,707 WARN org.apache.hadoop.hdfs.DFSClient:
> DFSOutputStream ResponseProcessor exception for block
> BP-60005389-172.30.21.49-1379424439243:blk_-4575496846575688807_62439488
> java.io.EOFException: Premature EOF: no length prefix available
>         at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:171)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:114)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:695)
>
> Once this error occurs, every time the code subsequently tries to write
> to HDFS, it gets a different error:
>
> java.io.IOException: All datanodes 172.30.120.193:50010 are bad. Aborting...
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:960)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
>
> This error happens for EVERY write.
>
> By the way, 172.30.21.49 is our NameNode, and 172.30.120.193 is the slave
> on which this task was running.
>
> What should I be looking at to stop this happening? Could it be resource
> contention somewhere? I looked at the NameNode console and we have enough
> disk space.
>
> Clearly, I want to avoid this happening, and would also like a
> recommendation for what to do if it does happen - currently, the exception
> is caught and a counter is incremented. Maybe we should be throwing this
> exception up so that the task is retried somewhere else.
>
> Any recommendations/advice are welcome.
>
> Thanks,
> Hayden