[ 
https://issues.apache.org/jira/browse/HBASE-22483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850519#comment-16850519
 ] 

Anoop Sam John commented on HBASE-22483:
----------------------------------------

Good one..  Now that we read the block data into these pooled ByteBuffers, it 
would be better to account for the extra bytes.  In the BucketCache also, the 
bucket sizes are chosen with +1KB extra (4+1, 8+1, ..., 64+1, ..., 512+1 KB).  
Adding 1KB extra is very much fine.
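
To make the sizing concrete, here is a quick sketch (hypothetical code, not 
the actual BucketAllocator source) of how buckets with +1KB headroom keep a 
"power-of-two + small delta" block inside a single bucket:

// Hypothetical sketch; the size list mirrors the (4+1)KB ... (512+1)KB
// scheme described above but is not copied from the HBase source.
public class BucketSizeSketch {
  private static final int KB = 1024;

  static final int[] BUCKET_SIZES = {
      (4 + 1) * KB, (8 + 1) * KB, (16 + 1) * KB, (32 + 1) * KB,
      (64 + 1) * KB, (128 + 1) * KB, (256 + 1) * KB, (512 + 1) * KB };

  // Pick the smallest bucket that can hold blockSize bytes.
  static int bucketFor(int blockSize) {
    for (int size : BUCKET_SIZES) {
      if (blockSize <= size) {
        return size;
      }
    }
    throw new IllegalArgumentException("block too large: " + blockSize);
  }

  public static void main(String[] args) {
    // A 64KB data block plus a small delta still fits the 65KB bucket.
    System.out.println(bucketFor(64 * KB + 217)); // prints 66560 (65KB)
  }
}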

> Maybe it's better to use 65KB as the default buffer size in ByteBuffAllocator
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-22483
>                 URL: https://issues.apache.org/jira/browse/HBASE-22483
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>         Attachments: 121240.stack, BucketCacheWriter-is-busy.png, 
> checksum-stacktrace.png
>
>
> There are some reasons why it's better to choose 65KB as the default buffer 
> size: 
> 1. Almost all data blocks have a block size of 64KB + delta, where delta is 
> very small and depends on the size of the last KeyValue. If we use the 
> default hbase.ipc.server.allocator.buffer.size=64KB, then each block will be 
> allocated as a MultiByteBuff: one 64KB DirectByteBuffer plus a 
> HeapByteBuffer of delta bytes, and the HeapByteBuffer will increase GC 
> pressure. Ideally, we should let the data block be allocated as a 
> SingleByteBuff: it has a simpler data structure, faster access, and less 
> heap usage (see the sketch after this list). 
> 2. In my benchmark, I found some checksum stack traces (see 
> [checksum-stacktrace.png|https://issues.apache.org/jira/secure/attachment/12969905/checksum-stacktrace.png]).
>  Since the block is a MultiByteBuff, we have to calculate the checksum via a 
> temporary heap copy (see HBASE-21917), whereas with a SingleByteBuff we 
> could speed up the checksum by calling Hadoop's checksum in its native lib, 
> which is much faster.
> 3. It seems the BucketCacheWriters were always busy because of the higher 
> cost of copying from a MultiByteBuff into a DirectByteBuffer. For a 
> SingleByteBuff we can just use unsafe array copying, while for a 
> MultiByteBuff we have to copy byte by byte.
> Anyway, I will give a benchmark for this. 
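> To make points 1-3 concrete, here is a minimal, self-contained sketch (a 
> hypothetical allocate() helper, not the real ByteBuffAllocator API) of why 
> a 65KB pool buffer turns a typical "64KB + delta" block into a single 
> buffer instead of two:
>
> import java.nio.ByteBuffer;
> import java.util.ArrayList;
> import java.util.List;
>
> public class AllocatorSketch {
>   static final int KB = 1024;
>
>   // Cover 'size' bytes with fixed-size direct buffers, the way a pooled
>   // allocator would; more than one buffer implies a MultiByteBuff.
>   static List<ByteBuffer> allocate(int size, int bufSize) {
>     List<ByteBuffer> buffers = new ArrayList<>();
>     for (int remaining = size; remaining > 0; remaining -= bufSize) {
>       buffers.add(ByteBuffer.allocateDirect(bufSize));
>     }
>     return buffers;
>   }
>
>   public static void main(String[] args) {
>     int blockSize = 64 * KB + 300; // typical data block: 64KB + small delta
>     // 64KB buffers: the block spans 2 buffers -> MultiByteBuff, so the
>     // checksum needs a temp heap copy and writes go byte by byte.
>     System.out.println(allocate(blockSize, 64 * KB).size()); // prints 2
>     // 65KB buffers: 1 buffer -> SingleByteBuff, so the native checksum
>     // and a single unsafe copy apply.
>     System.out.println(allocate(blockSize, 65 * KB).size()); // prints 1
>   }
> }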



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
