This might be caused by the default write packet size. In HDFS, user data is pipelined to datanodes in packets. The default packet size is 64K. If the chunk size is bigger than 64K, the packet size automatically adjusts to include at least one chunk.
Please set the packet size to 8MB by configuring dfs.client-write-packet-size (in trunk) and rerun your experiments.

Hairong

On 10/8/10 9:42 PM, "elton sky" <eltonsky9...@gmail.com> wrote:

> Hello,
>
> I was benchmarking write/read of HDFS.
>
> I changed the chunk size, i.e. bytesPerChecksum or bpc, and created a 1G file
> with 128MB block size. The bpc values I used: 512B, 32KB, 64KB, 256KB, 512KB,
> 2MB, 8MB.
>
> The result surprised me. The performance for 512B, 32KB and 64KB is quite
> similar, but as the bpc size increases further, the throughput decreases.
> Comparing 512B to 8MB, there's a 40% to 50% difference in throughput.
>
> Any idea why this happens?
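
For reference, a minimal sketch of how the rerun might look from the client side, assuming the trunk property name dfs.client-write-packet-size and the era's io.bytes.per.checksum key for bytesPerChecksum; the class name, output path, and 1GB write loop are illustrative, not part of the original benchmark:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PacketSizeWriteTest {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Raise the client write packet size to 8MB (default is 64K).
            conf.setInt("dfs.client-write-packet-size", 8 * 1024 * 1024);
            // Chunk size (bytesPerChecksum) under test, e.g. 8MB.
            conf.setInt("io.bytes.per.checksum", 8 * 1024 * 1024);

            FileSystem fs = FileSystem.get(conf);
            Path out = new Path("/benchmarks/packet-size-test");  // hypothetical path
            byte[] buffer = new byte[1 << 20];  // 1MB write buffer

            long start = System.currentTimeMillis();
            FSDataOutputStream stream = fs.create(out, true);
            for (int i = 0; i < 1024; i++) {    // 1024 x 1MB = 1GB
                stream.write(buffer);
            }
            stream.close();
            long elapsed = System.currentTimeMillis() - start;
            System.out.println("Wrote 1GB in " + elapsed + " ms");
        }
    }

With the packet size at least as large as the chunk size, each packet again carries a whole number of chunks, so the per-chunk comparison across bpc values should be less distorted by packet resizing.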