[
https://issues.apache.org/jira/browse/HADOOP-15053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Suganya updated HADOOP-15053:
-----------------------------
Description:
Hadoop datastream thread runs after 80th packet (5mb). till then datastreamer
thread was waiting.
Hadoop file write takes more time for the first 5mb data write process.
*Thread waits here - code*
while (((!this.streamerClosed) && (!this.hasError) &&
(DFSOutputStream.this.dfsClient.clientRunning) &&
(DFSOutputStream.this.dataQueue.size() == 0) && ((this.stage !=
BlockConstructionStage.DATA_STREAMING) || ((this.stage ==
BlockConstructionStage.DATA_STREAMING) && (now - lastPacket <
DFSOutputStream.this.dfsClient.getConf().socketTimeout / 2)))) || (doSleep)) {
long timeout =
DFSOutputStream.this.dfsClient.getConf().socketTimeout / 2 - (now - lastPacket);
timeout = timeout <= 0L ? 1000L : timeout;
timeout = this.stage == BlockConstructionStage.DATA_STREAMING ?
timeout : 1000L;
*Thread dump:*
Thread-9" #32 daemon prio=5 os_prio=0 tid=0x00007fcb79401800 nid=0x2c1b in
Object.wait() [0x00007fcb2c7a2000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:503)
- locked <0x00000006c6f95fd0> (a java.util.LinkedList)
*Debug logs:* - here DataStreamer for seq no 0 started after adding 80th packet
in queue
1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - Queued packet 80
1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - computePacketChunkSize:
src=/1/test/file4.txt, chunkSize=516, chunksPerPacket=127, packetSize=65532
1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk
allocating new packet seqno=81, src=/1/test/file4.txt, packetSize=65532,
chunksPerPacket=127, bytesCurBlock=5266944
1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk
packet full seqno=81, src=/1/test/file4.txt, bytesCurBlock=5331968,
blockSize=134217728, appendChunk=false
2022 [Thread-9] DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: addBlock
took 34ms
2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline =
172.20.19.76:50010
2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline =
172.20.9.13:50010
2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline =
172.20.19.70:50010
2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - Connecting to
datanode 172.20.19.76:50010
2048 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - Send buf size 131072
2090 [DataStreamer for file /1/test/file4.txt block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
org.apache.hadoop.hdfs.DFSClient - DataStreamer block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet
packet seqno:0 offsetInBlock:0 lastPacketInBlock:false lastByteOffsetInBlock:
65024
2091 [DataStreamer for file /1/test/file4.txt block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
org.apache.hadoop.hdfs.DFSClient - DataStreamer block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet
packet seqno:1 offsetInBlock:65024 lastPacketInBlock:false
lastByteOffsetInBlock: 130048
2091 [DataStreamer for file /1/test/file4.txt block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
org.apache.hadoop.hdfs.DFSClient - DataStreamer block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet
packet seqno:2 offsetInBlock:130048 lastPacketInBlock:false
lastByteOffsetInBlock: 195072
2233 [DataStreamer for file /1/test/file4.txt block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
org.apache.hadoop.hdfs.DFSClient - DataStreamer block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet
packet seqno:3 offsetInBlock:195072 lastPacketInBlock:false
lastByteOffsetInBlock: 260096
2333 [ResponseProcessor for block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
org.apache.hadoop.hdfs.DFSClient - DFSClient seqno: 0 status: SUCCESS status:
SUCCESS status: SUCCESS downstreamAckTimeNanos: 6015384
2333 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - Queued packet 81
2334 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - computePacketChunkSize:
src=/1/test/file4.txt, chunkSize=516, chunksPerPacket=127, packetSize=65532
2334 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk
allocating new packet seqno=82, src=/1/test/file4.txt, packetSize=65532,
chunksPerPacket=127, bytesCurBlock=5331968
was:
Hadoop datastream thread runs after 80th packet (5mb). till then datastreamer
thread was waiting.
Hadoop file write takes more time for the first 5mb data write process.
*Thread waits here - code*
while (((!this.streamerClosed) && (!this.hasError) &&
(DFSOutputStream.this.dfsClient.clientRunning) &&
(DFSOutputStream.this.dataQueue.size() == 0) && ((this.stage !=
BlockConstructionStage.DATA_STREAMING) || ((this.stage ==
BlockConstructionStage.DATA_STREAMING) && (now - lastPacket <
DFSOutputStream.this.dfsClient.getConf().socketTimeout / 2)))) || (doSleep)) {
long timeout =
DFSOutputStream.this.dfsClient.getConf().socketTimeout / 2 - (now - lastPacket);
timeout = timeout <= 0L ? 1000L : timeout;
timeout = this.stage == BlockConstructionStage.DATA_STREAMING ?
timeout : 1000L;
*Thread dump:*
Thread-9" #32 daemon prio=5 os_prio=0 tid=0x00007fcb79401800 nid=0x2c1b in
Object.wait() [0x00007fcb2c7a2000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:503)
- locked <0x00000006c6f95fd0> (a java.util.LinkedList)
*Debug logs:*
1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - Queued packet 80
1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - computePacketChunkSize:
src=/1/test/file4.txt, chunkSize=516, chunksPerPacket=127, packetSize=65532
1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk
allocating new packet seqno=81, src=/1/test/file4.txt, packetSize=65532,
chunksPerPacket=127, bytesCurBlock=5266944
1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk
packet full seqno=81, src=/1/test/file4.txt, bytesCurBlock=5331968,
blockSize=134217728, appendChunk=false
2022 [Thread-9] DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: addBlock
took 34ms
2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline =
172.20.19.76:50010
2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline =
172.20.9.13:50010
2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline =
172.20.19.70:50010
2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - Connecting to
datanode 172.20.19.76:50010
2048 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - Send buf size 131072
2090 [DataStreamer for file /1/test/file4.txt block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
org.apache.hadoop.hdfs.DFSClient - DataStreamer block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet
packet seqno:0 offsetInBlock:0 lastPacketInBlock:false lastByteOffsetInBlock:
65024
2091 [DataStreamer for file /1/test/file4.txt block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
org.apache.hadoop.hdfs.DFSClient - DataStreamer block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet
packet seqno:1 offsetInBlock:65024 lastPacketInBlock:false
lastByteOffsetInBlock: 130048
2091 [DataStreamer for file /1/test/file4.txt block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
org.apache.hadoop.hdfs.DFSClient - DataStreamer block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet
packet seqno:2 offsetInBlock:130048 lastPacketInBlock:false
lastByteOffsetInBlock: 195072
2233 [DataStreamer for file /1/test/file4.txt block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
org.apache.hadoop.hdfs.DFSClient - DataStreamer block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending packet
packet seqno:3 offsetInBlock:195072 lastPacketInBlock:false
lastByteOffsetInBlock: 260096
2333 [ResponseProcessor for block
BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
org.apache.hadoop.hdfs.DFSClient - DFSClient seqno: 0 status: SUCCESS status:
SUCCESS status: SUCCESS downstreamAckTimeNanos: 6015384
2333 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - Queued packet 81
2334 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - computePacketChunkSize:
src=/1/test/file4.txt, chunkSize=516, chunksPerPacket=127, packetSize=65532
2334 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk
allocating new packet seqno=82, src=/1/test/file4.txt, packetSize=65532,
chunksPerPacket=127, bytesCurBlock=5331968
> Datastreamer.run threads waits for 1s after getting outputstream
> ----------------------------------------------------------------
>
> Key: HADOOP-15053
> URL: https://issues.apache.org/jira/browse/HADOOP-15053
> Project: Hadoop Common
> Issue Type: Bug
> Components: performance
> Affects Versions: 2.5.1
> Reporter: Suganya
> Priority: Critical
>
> Hadoop datastream thread runs after 80th packet (5mb). till then datastreamer
> thread was waiting.
> Hadoop file write takes more time for the first 5mb data write process.
> *Thread waits here - code*
> while (((!this.streamerClosed) && (!this.hasError) &&
> (DFSOutputStream.this.dfsClient.clientRunning) &&
> (DFSOutputStream.this.dataQueue.size() == 0) && ((this.stage !=
> BlockConstructionStage.DATA_STREAMING) || ((this.stage ==
> BlockConstructionStage.DATA_STREAMING) && (now - lastPacket <
> DFSOutputStream.this.dfsClient.getConf().socketTimeout / 2)))) || (doSleep)) {
> long timeout =
> DFSOutputStream.this.dfsClient.getConf().socketTimeout / 2 - (now -
> lastPacket);
> timeout = timeout <= 0L ? 1000L : timeout;
> timeout = this.stage == BlockConstructionStage.DATA_STREAMING ?
> timeout : 1000L;
>
> *Thread dump:*
> Thread-9" #32 daemon prio=5 os_prio=0 tid=0x00007fcb79401800 nid=0x2c1b in
> Object.wait() [0x00007fcb2c7a2000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:503)
> - locked <0x00000006c6f95fd0> (a java.util.LinkedList)
> *Debug logs:* - here DataStreamer for seq no 0 started after adding 80th
> packet in queue
> 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - Queued packet 80
> 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - computePacketChunkSize:
> src=/1/test/file4.txt, chunkSize=516, chunksPerPacket=127, packetSize=65532
> 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk
> allocating new packet seqno=81, src=/1/test/file4.txt, packetSize=65532,
> chunksPerPacket=127, bytesCurBlock=5266944
> 1646 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk
> packet full seqno=81, src=/1/test/file4.txt, bytesCurBlock=5331968,
> blockSize=134217728, appendChunk=false
> 2022 [Thread-9] DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call:
> addBlock took 34ms
> 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline =
> 172.20.19.76:50010
> 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline =
> 172.20.9.13:50010
> 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline =
> 172.20.19.70:50010
> 2022 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - Connecting to
> datanode 172.20.19.76:50010
> 2048 [Thread-9] DEBUG org.apache.hadoop.hdfs.DFSClient - Send buf size 131072
> 2090 [DataStreamer for file /1/test/file4.txt block
> BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
> org.apache.hadoop.hdfs.DFSClient - DataStreamer block
> BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending
> packet packet seqno:0 offsetInBlock:0 lastPacketInBlock:false
> lastByteOffsetInBlock: 65024
> 2091 [DataStreamer for file /1/test/file4.txt block
> BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
> org.apache.hadoop.hdfs.DFSClient - DataStreamer block
> BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending
> packet packet seqno:1 offsetInBlock:65024 lastPacketInBlock:false
> lastByteOffsetInBlock: 130048
> 2091 [DataStreamer for file /1/test/file4.txt block
> BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
> org.apache.hadoop.hdfs.DFSClient - DataStreamer block
> BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending
> packet packet seqno:2 offsetInBlock:130048 lastPacketInBlock:false
> lastByteOffsetInBlock: 195072
> 2233 [DataStreamer for file /1/test/file4.txt block
> BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
> org.apache.hadoop.hdfs.DFSClient - DataStreamer block
> BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350 sending
> packet packet seqno:3 offsetInBlock:195072 lastPacketInBlock:false
> lastByteOffsetInBlock: 260096
> 2333 [ResponseProcessor for block
> BP-2107533656-172.20.14.104-1483595560691:blk_1074141603_401350] DEBUG
> org.apache.hadoop.hdfs.DFSClient - DFSClient seqno: 0 status: SUCCESS
> status: SUCCESS status: SUCCESS downstreamAckTimeNanos: 6015384
> 2333 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - Queued packet 81
> 2334 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - computePacketChunkSize:
> src=/1/test/file4.txt, chunkSize=516, chunksPerPacket=127, packetSize=65532
> 2334 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - DFSClient writeChunk
> allocating new packet seqno=82, src=/1/test/file4.txt, packetSize=65532,
> chunksPerPacket=127, bytesCurBlock=5331968
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]