[ https://issues.apache.org/jira/browse/HDFS-11608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaobing Zhou updated HDFS-11608:
---------------------------------
    Description: 
We've seen HDFS writes crash when the block size is huge. For example, when 
writing a 3 GB file with a 3 GB block size, the HDFS client throws an 
out-of-memory exception and the DataNode reports an IOException. After raising 
the client heap size limit, a DFSOutputStream ResponseProcessor exception is 
seen instead, followed by a broken pipe and pipeline recovery.
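A minimal reproduction sketch (the path, replication factor, and write loop 
are illustrative; it assumes a running HDFS cluster configured as the default 
filesystem):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HugeBlockWriteRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    final long blockSize = 3L * 1024 * 1024 * 1024; // 3 GB block size

    // FileSystem#create(path, overwrite, bufferSize, replication, blockSize)
    try (FSDataOutputStream out = fs.create(
        new Path("/tmp/huge-block-file"), true, 4096, (short) 3, blockSize)) {
      byte[] buf = new byte[1 << 20]; // write 1 MB of zeros per call
      for (long written = 0; written < blockSize; written += buf.length) {
        out.write(buf); // fails before the 3 GB block completes
      }
    }
  }
}
{code}
With the default client heap this dies with the OutOfMemoryError shown below; 
raising the heap (e.g. via HADOOP_CLIENT_OPTS) moves the failure to the 
ResponseProcessor/broken-pipe stage instead.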

The relevant traces are given below.
Client out-of-memory exception:
{noformat}
17/03/30 07:13:50 WARN hdfs.DFSClient: Caught exception
java.lang.InterruptedException
        at java.lang.Object.wait(Native Method)
        at java.lang.Thread.join(Thread.java:1245)
        at java.lang.Thread.join(Thread.java:1319)
        at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:624)
        at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeInternal(DFSOutputStream.java:592)
        at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at 
org.apache.hadoop.hdfs.util.ByteArrayManager$NewByteArrayWithoutLimit.newByteArray(ByteArrayManager.java:308)
        at 
org.apache.hadoop.hdfs.DFSOutputStream.createPacket(DFSOutputStream.java:197)
        at 
org.apache.hadoop.hdfs.DFSOutputStream.writeChunkImpl(DFSOutputStream.java:1906)
        at 
org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:1884)
        at 
org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:206)
        at 
org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:163)
        at 
org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:144)
        at 
org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:2321)
        at 
org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2303)
        at 
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
        at 
org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
        at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
        at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:261)
        at HdfsWriterOutputStream.run(HdfsWriterOutputStream.java:57)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
        at HdfsWriterOutputStream.main(HdfsWriterOutputStream.java:77)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}

Client ResponseProcessor exception:
{noformat}
17/03/30 18:20:12 WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor 
exception  for block 
BP-1828245847-192.168.64.101-1490851685890:blk_1073741859_1040
java.io.EOFException: Premature EOF: no length prefix available
        at 
org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2293)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:244)
        at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:748)
17/03/30 18:22:32 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
        at 
org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
        at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
        at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
        at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
        at java.io.DataOutputStream.write(DataOutputStream.java:107)
        at org.apache.hadoop.hdfs.DFSPacket.writeTo(DFSPacket.java:176)
        at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:522)
{noformat}

DataNode exception:
{noformat}
2017-03-30 16:34:33,828 ERROR datanode.DataNode (DataXceiver.java:run(278)) - 
c6401.ambari.apache.org:50010:DataXceiver error processing WRITE_BLOCK 
operation  src: /192.168.64.101:47167 dst: /192.168.64.101:50010
java.io.IOException: Incorrect value for packet payload size: 2147483128
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:159)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:502)
        at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:898)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:806)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
        at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
        at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:251)
        at java.lang.Thread.run(Thread.java:745)
{noformat}
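Note that the rejected payload size (2147483128) is within a few hundred bytes 
of Integer.MAX_VALUE, which suggests the client-side packet size computation 
overflows or degenerates when the remaining block length is huge; the DataNode 
rejecting the packet and dropping the connection would also be consistent with 
the premature EOF / broken pipe seen on the client. For context, here is a 
simplified sketch of the kind of bounds check the DataNode applies when 
reading a packet off the wire (illustrative, not verbatim Hadoop source; the 
16 MB cap is an assumption modeled on PacketReceiver.MAX_PACKET_SIZE):
{code:java}
import java.io.IOException;

public class PacketSizeCheckSketch {
  // Assumed cap, modeled on PacketReceiver.MAX_PACKET_SIZE (16 MB).
  static final int MAX_PACKET_SIZE = 16 * 1024 * 1024;

  // Simplified form of the DataNode-side sanity check that produces the
  // "Incorrect value for packet payload size" error in the trace above.
  static void checkPayloadSize(int payloadLen) throws IOException {
    if (payloadLen < 0 || payloadLen > MAX_PACKET_SIZE) {
      throw new IOException(
          "Incorrect value for packet payload size: " + payloadLen);
    }
  }
}
{code}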

> HDFS write crashed in the case of huge block size
> -------------------------------------------------
>
>                 Key: HDFS-11608
>                 URL: https://issues.apache.org/jira/browse/HDFS-11608
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.8.0
>            Reporter: Xiaobing Zhou
>            Assignee: Xiaobing Zhou
>            Priority: Critical
>


