[
https://issues.apache.org/jira/browse/HDFS-11608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaobing Zhou updated HDFS-11608:
---------------------------------
Description:
We've seen HDFS write crashes in the case of huge block size. For example,
writing a 3G file using 3G block size, HDFS client throws out of memory
exception. DataNode gives out IOException. After changing heap size limit,
DFSOutputStream ResponseProcessor exception is seen followed by Broken pipe and
pipeline recovery.
Give below:
Client out-of-mem exception,
{noformat}
17/03/30 07:13:50 WARN hdfs.DFSClient: Caught exception
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1245)
at java.lang.Thread.join(Thread.java:1319)
at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:624)
at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeInternal(DFSOutputStream.java:592)
at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at
org.apache.hadoop.hdfs.util.ByteArrayManager$NewByteArrayWithoutLimit.newByteArray(ByteArrayManager.java:308)
at
org.apache.hadoop.hdfs.DFSOutputStream.createPacket(DFSOutputStream.java:197)
at
org.apache.hadoop.hdfs.DFSOutputStream.writeChunkImpl(DFSOutputStream.java:1906)
at
org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:1884)
at
org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:206)
at
org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:163)
at
org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:144)
at
org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:2321)
at
org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2303)
at
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
at
org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:261)
at HdfsWriterOutputStream.run(HdfsWriterOutputStream.java:57)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at HdfsWriterOutputStream.main(HdfsWriterOutputStream.java:77)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}
Client ResponseProcessor exception,
{noformat}
17/03/30 18:20:12 WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor
exception for block
BP-1828245847-192.168.64.101-1490851685890:blk_1073741859_1040
java.io.EOFException: Premature EOF: no length prefix available
at
org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2293)
at
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:244)
at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:748)
17/03/30 18:22:32 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at
org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
at
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
at
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.hadoop.hdfs.DFSPacket.writeTo(DFSPacket.java:176)
at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:522)
{noformat}
DN exception,
{noformat}
2017-03-30 16:34:33,828 ERROR datanode.DataNode (DataXceiver.java:run(278)) -
c6401.ambari.apache.org:50010:DataXceiver error processing WRITE_BLOCK
operation src: /192.168.64.101:47167 dst: /192.168.64.101:50010
java.io.IOException: Incorrect value for packet payload size: 2147483128
at
org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:159)
at
org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:502)
at
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:898)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:806)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:251)
at java.lang.Thread.run(Thread.java:745)
{noformat}
> HDFS write crashed in the case of huge block size
> -------------------------------------------------
>
> Key: HDFS-11608
> URL: https://issues.apache.org/jira/browse/HDFS-11608
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 2.8.0
> Reporter: Xiaobing Zhou
> Assignee: Xiaobing Zhou
> Priority: Critical
>
> We've seen HDFS write crashes in the case of huge block size. For example,
> writing a 3G file using 3G block size, HDFS client throws out of memory
> exception. DataNode gives out IOException. After changing heap size limit,
> DFSOutputStream ResponseProcessor exception is seen followed by Broken pipe
> and pipeline recovery.
> Give below:
> Client out-of-mem exception,
> {noformat}
> 17/03/30 07:13:50 WARN hdfs.DFSClient: Caught exception
> java.lang.InterruptedException
> at java.lang.Object.wait(Native Method)
> at java.lang.Thread.join(Thread.java:1245)
> at java.lang.Thread.join(Thread.java:1319)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:624)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeInternal(DFSOutputStream.java:592)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> at
> org.apache.hadoop.hdfs.util.ByteArrayManager$NewByteArrayWithoutLimit.newByteArray(ByteArrayManager.java:308)
> at
> org.apache.hadoop.hdfs.DFSOutputStream.createPacket(DFSOutputStream.java:197)
> at
> org.apache.hadoop.hdfs.DFSOutputStream.writeChunkImpl(DFSOutputStream.java:1906)
> at
> org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:1884)
> at
> org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:206)
> at
> org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:163)
> at
> org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:144)
> at
> org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:2321)
> at
> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2303)
> at
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
> at
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
> at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
> at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:261)
> at HdfsWriterOutputStream.run(HdfsWriterOutputStream.java:57)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at HdfsWriterOutputStream.main(HdfsWriterOutputStream.java:77)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> Client ResponseProcessor exception,
> {noformat}
> 17/03/30 18:20:12 WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor
> exception for block
> BP-1828245847-192.168.64.101-1490851685890:blk_1073741859_1040
> java.io.EOFException: Premature EOF: no length prefix available
> at
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2293)
> at
> org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:244)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:748)
> 17/03/30 18:22:32 WARN hdfs.DFSClient: DataStreamer Exception
> java.io.IOException: Broken pipe
> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
> at sun.nio.ch.IOUtil.write(IOUtil.java:65)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
> at
> org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
> at
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
> at
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
> at
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
> at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
> at java.io.DataOutputStream.write(DataOutputStream.java:107)
> at org.apache.hadoop.hdfs.DFSPacket.writeTo(DFSPacket.java:176)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:522)
> {noformat}
> DN exception,
> {noformat}
> 2017-03-30 16:34:33,828 ERROR datanode.DataNode (DataXceiver.java:run(278)) -
> c6401.ambari.apache.org:50010:DataXceiver error processing WRITE_BLOCK
> operation src: /192.168.64.101:47167 dst: /192.168.64.101:50010
> java.io.IOException: Incorrect value for packet payload size: 2147483128
> at
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:159)
> at
> org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
> at
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:502)
> at
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:898)
> at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:806)
> at
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
> at
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
> at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:251)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]