You the man @liang xie. Let me try your suggestion here on my little test bench. Let's get the below into the refguide also....
St.Ack
On Mon, Jul 28, 2014 at 8:03 PM, 谢良 <[email protected]> wrote:
> The default dfs.client-write-packet-size value is 64k, at least in my
> Hadoop2 env.
> I benchmarked it via YCSB, loading 2 million records (3*200 bytes):
> 1) dfs.client-write-packet-size=64k ygc count:399, ygct:4.208s
> 2) dfs.client-write-packet-size=8k ygc count:163, ygct:2.644s
> As you can see, that's about a 40% benefit on gct :)
> The reason: in the DFSOutputStream.Packet class, each "Create a new packet"
> operation
> calls "buf = new byte[PacketHeader.PKT_MAX_HEADER_LEN + pktSize];".
> Here "pktSize" comes from the dfs.client-write-packet-size setting, and in
> the HBase write scenario
> we sync the WAL asap, so all the new packets are very small
> (in my YCSB testing, most of them were only hundreds of bytes, or a few
> kilobytes)
> and rarely reach 64k, so always allocating a 64k array is just a waste.
> It would be good to add a note about this to the refguide :)
>
> P.S. 8k is just a test setting; we should set it according to the real KV
> size pattern.
>
> Thanks,
>
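
For the refguide note, the allocation argument above can be put into rough numbers. The following is only a back-of-the-envelope sketch, not HDFS code: the class name, the method names, the header length of 33 bytes, the packet count, and the assumption that each packet carries one 600-byte record are all made up for illustration. Check PacketHeader.PKT_MAX_HEADER_LEN in your Hadoop version and measure your own packet sizes before drawing conclusions.

```java
public class PacketAllocationSketch {

    /** Bytes allocated for `packets` buffers of size headerLen + packetSize each. */
    static long allocatedBytes(int headerLen, int packetSize, long packets) {
        return (long) (headerLen + packetSize) * packets;
    }

    /** Bytes of the data area left unused when each packet carries payloadBytes. */
    static long unusedBytes(int packetSize, int payloadBytes, long packets) {
        // A payload larger than the packet size would be split across packets;
        // for this estimate we simply treat such a packet as full.
        int used = Math.min(payloadBytes, packetSize);
        return (long) (packetSize - used) * packets;
    }

    public static void main(String[] args) {
        int headerLen = 33;          // placeholder, stands in for PKT_MAX_HEADER_LEN
        int payload = 3 * 200;       // one record from the benchmark, assumed per packet
        long packets = 2_000_000L;   // hypothetical: one packet per loaded record

        for (int packetSize : new int[] {64 * 1024, 8 * 1024}) {
            long allocated = allocatedBytes(headerLen, packetSize, packets);
            long unused = unusedBytes(packetSize, payload, packets);
            System.out.printf("packetSize=%d allocated=%d bytes, unused=%d bytes%n",
                    packetSize, allocated, unused);
        }
    }
}
```

Under these assumptions the 64k setting allocates roughly eight times as many bytes as the 8k setting for the same small writes, which fits the lower ygc count in the benchmark. The real ratio depends on the actual packet size distribution.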
