Hairong Kuang wrote:
Two main reasons caused the performance decrease:
1. NNBench sets the block size to 1 byte. Although it generates a file
with only 1 byte, the file's checksum file has 16 bytes (a 12-byte header
plus 4 bytes of checksums). Without the checksum file, only 1 block needs
to be generated. With the checksum file, 17 blocks need to be generated
(1 for the data plus 16 for the checksum file). So the overhead of
generating a checksum file is huge in this special case.
HADOOP-1134 should help a lot for this.
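
For reference, the block count above can be checked with a few lines of
Python. This is only a sketch of the arithmetic, not Hadoop code: the helper
name blocks_needed is made up for illustration, and it assumes the checksum
file is split using the same 1-byte block size as the data file, which is
what the figure of 17 blocks implies.

```python
def blocks_needed(file_bytes: int, block_size: int) -> int:
    """Blocks required to hold file_bytes (ceiling division, integers only)."""
    return -(-file_bytes // block_size)


BLOCK_SIZE = 1            # bytes, the NNBench setting described above
DATA_BYTES = 1            # size of the generated file
CHECKSUM_HEADER = 12      # bytes, figure taken from the message
CHECKSUM_PAYLOAD = 4      # bytes of checksums, figure taken from the message

# Assumption: the checksum file uses the same block size as the data file.
data_blocks = blocks_needed(DATA_BYTES, BLOCK_SIZE)
checksum_blocks = blocks_needed(CHECKSUM_HEADER + CHECKSUM_PAYLOAD, BLOCK_SIZE)
total_blocks = data_blocks + checksum_blocks

print(f"data={data_blocks} checksum={checksum_blocks} total={total_blocks}")
```

With any realistic block size both files would fit in one block each, so the
17x blow-up is specific to the 1-byte block size used by the benchmark.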
Thanks. The numbers now make sense.
Raghu.