Hello folks, the HDFS design doc says that the client buffers a full block's worth of data before contacting the namenode for datanode information, which is supposed to be the optimal approach for network throughput. However, I could not find this buffering logic in the source code.
In DFSClient.DataStreamer, the streamer thread waits for dataQueue to become non-empty, then sends a request to the namenode and builds a pipeline. When this happens, the number of packets in dataQueue is always 1, which is far less than a block! I am confused here. Can anyone explain this, or point out where I am wrong?
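
To make the behaviour I am describing concrete, here is a minimal sketch of how I read the streamer loop. It is not the actual Hadoop source: the class StreamerSketch, its fields and methods, and the 64 KB packet size are all names and values I made up for illustration. It only shows that a thread waiting on a non-empty queue can set up its pipeline as soon as the first packet arrives, without waiting for a whole block to accumulate.

```java
import java.util.LinkedList;

// Hypothetical sketch; names echo the question, not the real DFSClient code.
public class StreamerSketch {
    private final LinkedList<byte[]> dataQueue = new LinkedList<>();
    private boolean closed = false;
    private boolean pipelineReady = false;

    int queueSizeAtSetup = -1; // queue length seen when the pipeline is set up
    int packetsSent = 0;
    long bytesSent = 0;

    // Writer side: enqueue one packet and wake the streamer.
    void enqueue(byte[] packet) {
        synchronized (dataQueue) {
            dataQueue.addLast(packet);
            dataQueue.notifyAll();
        }
    }

    // Writer side: signal that no more packets will arrive.
    void close() {
        synchronized (dataQueue) {
            closed = true;
            dataQueue.notifyAll();
        }
    }

    // Streamer side: block until the queue is non-empty, then drain it.
    void runStreamer() throws InterruptedException {
        while (true) {
            byte[] packet;
            synchronized (dataQueue) {
                while (dataQueue.isEmpty() && !closed) {
                    dataQueue.wait();
                }
                if (dataQueue.isEmpty()) {
                    return; // closed and fully drained
                }
                if (!pipelineReady) {
                    // Stand-in for "ask the namenode and build the pipeline".
                    queueSizeAtSetup = dataQueue.size();
                    pipelineReady = true;
                }
                packet = dataQueue.removeFirst();
            }
            // Stand-in for sending the packet down the pipeline.
            packetsSent++;
            bytesSent += packet.length;
        }
    }

    // Runs one writer (the calling thread) against one streamer thread.
    static StreamerSketch demo(int packets) {
        StreamerSketch s = new StreamerSketch();
        Thread streamer = new Thread(() -> {
            try {
                s.runStreamer();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        streamer.start();
        for (int i = 0; i < packets; i++) {
            s.enqueue(new byte[64 * 1024]); // assumed packet size
        }
        s.close();
        try {
            streamer.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return s;
    }

    public static void main(String[] args) {
        StreamerSketch s = demo(8);
        System.out.println("queue size at pipeline setup: " + s.queueSizeAtSetup);
        System.out.println("packets sent: " + s.packetsSent);
    }
}
```

The queue size printed at pipeline setup depends on thread scheduling, so it will vary between runs, but it is never anywhere near a full block's worth of packets, which is what puzzles me about the real code.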