I need to identify the bottleneck of my current cluster when running
I/O-bound benchmarks.
I ran a test with 4 nodes: 1 node as the JobTracker and NameNode, and 3 nodes
as TaskTrackers and DataNodes.
I ran RandomWriter to generate 30 GB of data with 15 mappers, and then ran
Sort on the generated data with 15 reducers. The replication factor is 3.
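
For reference, a minimal sketch of how the two jobs can be driven from a
driver class (they can equally be launched with the stock examples jar on
the command line; the class name, output paths, and the way the 30 GB /
15-mapper sizing is passed to RandomWriter are illustrative, since those
properties vary between Hadoop versions):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.examples.RandomWriter;
import org.apache.hadoop.examples.Sort;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver mirroring the benchmark setup described above.
public class IoBenchDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setInt("dfs.replication", 3);  // 3 replicas per block

    // Generate the random data set (~30 GB with 15 mappers in the test
    // above). The total-bytes / maps-per-host knobs are RandomWriter
    // configuration properties whose names differ across Hadoop versions,
    // so they are omitted here rather than guessed.
    ToolRunner.run(conf, new RandomWriter(),
        new String[] { "/bench/random-out" });

    // Sort the generated data with 15 reducers ("-r" sets the reduce count
    // in the examples Sort job).
    ToolRunner.run(conf, new Sort(),
        new String[] { "-r", "15", "/bench/random-out", "/bench/sort-out" });
  }
}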
I added logging code to HDFS and then analyzed the generated logs separately
for the RandomWriter phase and the Sort phase.
The results are shown below.
I measured the following values (a sketch of the instrumentation follows this list):
1) average block preparation time: the time DFSClient spends generating all
packets for a block.
2) average block writing time: from the moment DFSClient receives an allocated
block from the NameNode in nextBlockOutputStream() until all acks are received.
3) average network receiving time and average disk writing time per block on
the first, second, and third datanodes in the pipeline.
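
Roughly, the instrumentation is a simple per-block timer that logs elapsed
wall-clock time; here is a minimal sketch (the BlockTimer helper and its log
format are illustrative, the real changes live inside DFSClient and the
DataNode):

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Illustrative timing helper: start it at the event of interest, stop it at
// the matching end event, then average the logged values per block offline.
public class BlockTimer {
  private static final Log LOG = LogFactory.getLog(BlockTimer.class);

  private final String label;  // e.g. "blockPrepare", "blockWrite", "diskWrite"
  private final long blockId;  // which block the measurement belongs to
  private final long startMs = System.currentTimeMillis();

  public BlockTimer(String label, long blockId) {
    this.label = label;
    this.blockId = blockId;
  }

  // Logs the elapsed wall-clock time in milliseconds.
  public void stop() {
    long elapsedMs = System.currentTimeMillis() - startMs;
    LOG.info("TIMING " + label + " blockId=" + blockId + " ms=" + elapsedMs);
  }
}

For value 2), for example, a timer is started when nextBlockOutputStream()
returns the newly allocated block and stopped when the last ack for that
block arrives; on each datanode, separate timers bracket the socket reads
and the local disk writes.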

RandomWriter
Total blocks: 528, total size: 34063336366 bytes, full 64 MB blocks: 506
Average block preparation time by the client: 11456.21
Average writing time for one 64 MB block: 11931.49
Average times on the No. 0 (first) target datanode:
  average network receiving time: 112.44
  average disk writing time:      3035.04
Average times on the No. 1 (second) target datanode:
  average network receiving time: 3337.68
  average disk writing time:      2950.74
Average times on the No. 2 (third) target datanode:
  average network receiving time: 3171.18
  average disk writing time:      2646.38
 

Sort
Total blocks: 494, total size: 32318504139 bytes, full 64 MB blocks: 479
Average block preparation time by the client: 16237.59
Average writing time for one 64 MB block: 16642.67
Average times on the No. 0 (first) target datanode:
  average network receiving time: 164.28
  average disk writing time:      3331.50
Average times on the No. 1 (second) target datanode:
  average network receiving time: 2125.62
  average disk writing time:      3436.32
Average times on the No. 2 (third) target datanode:
  average network receiving time: 2856.56
  average disk writing time:      3426.04

My question is why the network receiving time on the third datanode is larger
than on the other two. My other question is how to identify the bottleneck;
alternatively, what other kinds of values should I collect?

Thanks in advance.
-- 
View this message in context: 
http://www.nabble.com/Question-on-HDFS-write-performance-tp23814528p23814528.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
