[ https://issues.apache.org/jira/browse/HADOOP-15053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Suganya updated HADOOP-15053: ----------------------------- Description: Hadoop datastream thread runs after 80th packet (5mb). till then datastreamer thread was waiting. Hadoop file write takes more time for the first 5mb data write process. *Thread waits here - code* while (((!this.streamerClosed) && (!this.hasError) && (DFSOutputStream.this.dfsClient.clientRunning) && (DFSOutputStream.this.dataQueue.size() == 0) && ((this.stage != BlockConstructionStage.DATA_STREAMING) || ((this.stage == BlockConstructionStage.DATA_STREAMING) && (now - lastPacket < DFSOutputStream.this.dfsClient.getConf().socketTimeout / 2)))) || (doSleep)) { long timeout = DFSOutputStream.this.dfsClient.getConf().socketTimeout / 2 - (now - lastPacket); timeout = timeout <= 0L ? 1000L : timeout; timeout = this.stage == BlockConstructionStage.DATA_STREAMING ? timeout : 1000L; *Thread dump:* Thread-9" #32 daemon prio=5 os_prio=0 tid=0x00007fcb79401800 nid=0x2c1b in Object.wait() [0x00007fcb2c7a2000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:503) - locked <0x00000006c6f95fd0> (a java.util.LinkedList) *Debug logs:* - here DataStreamer for seq no 0 started after adding 80th packet in queue 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - Queued packet 80 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - computePacketChunkSize: src=/1/test/file4.txt, chunkSize=516, chunksPerPacket=127, packetSize=65532 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk allocating new packet seqno=81, src=/1/test/file4.txt, packetSize=65532, chunksPerPacket=127, bytesCurBlock=5266944 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk packet full seqno=81, src=/1/test/file4.txt, bytesCurBlock=5331968, blockSize=134217728, appendChunk=false 2022 [Thread-9] DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: addBlock took 34ms 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline = 172.20.19.76:50010 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline = 172.20.9.13:50010 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline = 172.20.19.70:50010 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - Connecting to datanode 172.20.19.76:50010 2048 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - Send buf size 131072 2090 [DataStreamer for file /1/test/file4.txt block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG org.apache.hadoop.hdfs.DFSClient - DataStreamer block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet packet seqno:0 offsetInBlock:0 lastPacketInBlock:false lastByteOffsetInBlock: 65024 2091 [DataStreamer for file /1/test/file4.txt block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG org.apache.hadoop.hdfs.DFSClient - DataStreamer block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet packet seqno:1 offsetInBlock:65024 lastPacketInBlock:false lastByteOffsetInBlock: 130048 2091 [DataStreamer for file /1/test/file4.txt block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG org.apache.hadoop.hdfs.DFSClient - DataStreamer block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet packet seqno:2 offsetInBlock:130048 lastPacketInBlock:false lastByteOffsetInBlock: 195072 2233 [DataStreamer for file /1/test/file4.txt block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG org.apache.hadoop.hdfs.DFSClient - DataStreamer block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet packet seqno:3 offsetInBlock:195072 lastPacketInBlock:false lastByteOffsetInBlock: 260096 2333 [ResponseProcessor for block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient seqno: 0 status: SUCCESS status: SUCCESS status: SUCCESS downstreamAckTimeNanos: 6015384 2333 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - Queued packet 81 2334 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - computePacketChunkSize: src=/1/test/file4.txt, chunkSize=516, chunksPerPacket=127, packetSize=65532 2334 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk allocating new packet seqno=82, src=/1/test/file4.txt, packetSize=65532, chunksPerPacket=127, bytesCurBlock=5331968 was: Hadoop datastream thread runs after 80th packet (5mb). till then datastreamer thread was waiting. Hadoop file write takes more time for the first 5mb data write process. *Thread waits here - code* while (((!this.streamerClosed) && (!this.hasError) && (DFSOutputStream.this.dfsClient.clientRunning) && (DFSOutputStream.this.dataQueue.size() == 0) && ((this.stage != BlockConstructionStage.DATA_STREAMING) || ((this.stage == BlockConstructionStage.DATA_STREAMING) && (now - lastPacket < DFSOutputStream.this.dfsClient.getConf().socketTimeout / 2)))) || (doSleep)) { long timeout = DFSOutputStream.this.dfsClient.getConf().socketTimeout / 2 - (now - lastPacket); timeout = timeout <= 0L ? 1000L : timeout; timeout = this.stage == BlockConstructionStage.DATA_STREAMING ? timeout : 1000L; *Thread dump:* Thread-9" #32 daemon prio=5 os_prio=0 tid=0x00007fcb79401800 nid=0x2c1b in Object.wait() [0x00007fcb2c7a2000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:503) - locked <0x00000006c6f95fd0> (a java.util.LinkedList) *Debug logs:* 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - Queued packet 80 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - computePacketChunkSize: src=/1/test/file4.txt, chunkSize=516, chunksPerPacket=127, packetSize=65532 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk allocating new packet seqno=81, src=/1/test/file4.txt, packetSize=65532, chunksPerPacket=127, bytesCurBlock=5266944 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk packet full seqno=81, src=/1/test/file4.txt, bytesCurBlock=5331968, blockSize=134217728, appendChunk=false 2022 [Thread-9] DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: addBlock took 34ms 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline = 172.20.19.76:50010 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline = 172.20.9.13:50010 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline = 172.20.19.70:50010 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - Connecting to datanode 172.20.19.76:50010 2048 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - Send buf size 131072 2090 [DataStreamer for file /1/test/file4.txt block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG org.apache.hadoop.hdfs.DFSClient - DataStreamer block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet packet seqno:0 offsetInBlock:0 lastPacketInBlock:false lastByteOffsetInBlock: 65024 2091 [DataStreamer for file /1/test/file4.txt block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG org.apache.hadoop.hdfs.DFSClient - DataStreamer block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet packet seqno:1 offsetInBlock:65024 lastPacketInBlock:false lastByteOffsetInBlock: 130048 2091 [DataStreamer for file /1/test/file4.txt block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG org.apache.hadoop.hdfs.DFSClient - DataStreamer block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet packet seqno:2 offsetInBlock:130048 lastPacketInBlock:false lastByteOffsetInBlock: 195072 2233 [DataStreamer for file /1/test/file4.txt block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG org.apache.hadoop.hdfs.DFSClient - DataStreamer block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet packet seqno:3 offsetInBlock:195072 lastPacketInBlock:false lastByteOffsetInBlock: 260096 2333 [ResponseProcessor for block BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient seqno: 0 status: SUCCESS status: SUCCESS status: SUCCESS downstreamAckTimeNanos: 6015384 2333 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - Queued packet 81 2334 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - computePacketChunkSize: src=/1/test/file4.txt, chunkSize=516, chunksPerPacket=127, packetSize=65532 2334 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk allocating new packet seqno=82, src=/1/test/file4.txt, packetSize=65532, chunksPerPacket=127, bytesCurBlock=5331968 > Datastreamer.run threads waits for 1s after getting outputstream > ---------------------------------------------------------------- > > Key: HADOOP-15053 > URL: https://issues.apache.org/jira/browse/HADOOP-15053 > Project: Hadoop Common > Issue Type: Bug > Components: performance > Affects Versions: 2.5.1 > Reporter: Suganya > Priority: Critical > > Hadoop datastream thread runs after 80th packet (5mb). till then datastreamer > thread was waiting. > Hadoop file write takes more time for the first 5mb data write process. > *Thread waits here - code* > while (((!this.streamerClosed) && (!this.hasError) && > (DFSOutputStream.this.dfsClient.clientRunning) && > (DFSOutputStream.this.dataQueue.size() == 0) && ((this.stage != > BlockConstructionStage.DATA_STREAMING) || ((this.stage == > BlockConstructionStage.DATA_STREAMING) && (now - lastPacket < > DFSOutputStream.this.dfsClient.getConf().socketTimeout / 2)))) || (doSleep)) { > long timeout = > DFSOutputStream.this.dfsClient.getConf().socketTimeout / 2 - (now - > lastPacket); > timeout = timeout <= 0L ? 1000L : timeout; > timeout = this.stage == BlockConstructionStage.DATA_STREAMING ? > timeout : 1000L; > > *Thread dump:* > Thread-9" #32 daemon prio=5 os_prio=0 tid=0x00007fcb79401800 nid=0x2c1b in > Object.wait() [0x00007fcb2c7a2000] > java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:503) > - locked <0x00000006c6f95fd0> (a java.util.LinkedList) > *Debug logs:* - here DataStreamer for seq no 0 started after adding 80th > packet in queue > 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - Queued packet 80 > 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - computePacketChunkSize: > src=/1/test/file4.txt, chunkSize=516, chunksPerPacket=127, packetSize=65532 > 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk > allocating new packet seqno=81, src=/1/test/file4.txt, packetSize=65532, > chunksPerPacket=127, bytesCurBlock=5266944 > 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk > packet full seqno=81, src=/1/test/file4.txt, bytesCurBlock=5331968, > blockSize=134217728, appendChunk=false > 2022 [Thread-9] DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: > addBlock took 34ms > 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline = > 172.20.19.76:50010 > 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline = > 172.20.9.13:50010 > 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline = > 172.20.19.70:50010 > 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - Connecting to > datanode 172.20.19.76:50010 > 2048 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - Send buf size 131072 > 2090 [DataStreamer for file /1/test/file4.txt block > BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG > org.apache.hadoop.hdfs.DFSClient - DataStreamer block > BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending > packet packet seqno:0 offsetInBlock:0 lastPacketInBlock:false > lastByteOffsetInBlock: 65024 > 2091 [DataStreamer for file /1/test/file4.txt block > BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG > org.apache.hadoop.hdfs.DFSClient - DataStreamer block > BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending > packet packet seqno:1 offsetInBlock:65024 lastPacketInBlock:false > lastByteOffsetInBlock: 130048 > 2091 [DataStreamer for file /1/test/file4.txt block > BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG > org.apache.hadoop.hdfs.DFSClient - DataStreamer block > BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending > packet packet seqno:2 offsetInBlock:130048 lastPacketInBlock:false > lastByteOffsetInBlock: 195072 > 2233 [DataStreamer for file /1/test/file4.txt block > BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG > org.apache.hadoop.hdfs.DFSClient - DataStreamer block > BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending > packet packet seqno:3 offsetInBlock:195072 lastPacketInBlock:false > lastByteOffsetInBlock: 260096 > 2333 [ResponseProcessor for block > BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG > org.apache.hadoop.hdfs.DFSClient - DFSClient seqno: 0 status: SUCCESS > status: SUCCESS status: SUCCESS downstreamAckTimeNanos: 6015384 > 2333 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - Queued packet 81 > 2334 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - computePacketChunkSize: > src=/1/test/file4.txt, chunkSize=516, chunksPerPacket=127, packetSize=65532 > 2334 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk > allocating new packet seqno=82, src=/1/test/file4.txt, packetSize=65532, > chunksPerPacket=127, bytesCurBlock=5331968 -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org