[ 
https://issues.apache.org/jira/browse/HBASE-23279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024227#comment-17024227
 ] 

Anoop Sam John commented on HBASE-23279:
----------------------------------------

In the bucket cache (BC), the bucket size selection should be based on the size
of the HFile blocks. We have 65 KB buckets so as to accommodate 64 KB blocks
(allowing for the extra per-block header and a possible overflow from the last
cell).
With or without data block encoding (DBE), we try to stay within this block
size while writing files: we track the sizes and check when to close the block
currently being written.
With Row Index encoding, this tracking does not account for the row offsets
index, because the offsets are kept only in memory while the block is being
written and are written out at the end of the block. The decision to end the
block was based on a size check that did NOT include this offsets index.
Unless we fix this issue, we should not enable this globally.
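
To make the tracking gap concrete, here is a minimal sketch. It is not the
actual HBase writer code: the class name, the fixed row size, and the 4 bytes
per row offset are assumptions made purely for illustration. It compares the
final block size when the close check ignores the in-memory offsets index with
the size when the check includes it.

```java
// Illustrative sketch only; not the HBase encoder or writer implementation.
final class RowIndexBlockSketch {
    // Target block size from the comment above: 64 KB.
    static final int BLOCK_SIZE_LIMIT = 64 * 1024;
    // Assumption: each row contributes one 4-byte offset to the index.
    static final int BYTES_PER_ROW_OFFSET = 4;

    /**
     * Appends fixed-size rows until the close check fires, then returns the
     * size of the finished block (data bytes plus the offsets index).
     * When countIndex is false, the check looks at data bytes only.
     */
    static int finalBlockSize(int rowBytes, boolean countIndex) {
        int dataBytes = 0;
        int rows = 0;
        while (true) {
            int indexBytes = rows * BYTES_PER_ROW_OFFSET;
            int tracked = dataBytes + (countIndex ? indexBytes : 0);
            if (tracked >= BLOCK_SIZE_LIMIT) {
                break; // close the current block
            }
            dataBytes += rowBytes;
            rows++;
        }
        // The offsets index is written at the end of the block either way.
        return dataBytes + rows * BYTES_PER_ROW_OFFSET;
    }

    public static void main(String[] args) {
        System.out.println("index ignored: " + finalBlockSize(100, false));
        System.out.println("index counted: " + finalBlockSize(100, true));
    }
}
```

Under these assumptions, with 100-byte rows the block closed without counting
the index ends up larger than 65 KB, while the block closed with the index
counted stays below it.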
If the size of a block comes out at more than 65 KB because of this index, we
won't be able to fit it into this bucket; it will go to the next bucket
instead, wasting a lot of cache memory.
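
A rough sketch of the effect, assuming a block is placed in the smallest bucket
that can hold it. The bucket size list below is illustrative only (the actual
list is configurable), and the class and method names are hypothetical, not
BucketCache API.

```java
// Illustrative sketch only; not the BucketCache allocator.
final class BucketPickSketch {
    // Assumption: an example list of bucket sizes in KB, smallest first.
    static final int[] BUCKET_SIZES_KB = {5, 9, 17, 33, 65, 97, 129};

    /** Returns the smallest bucket size in bytes that fits the block, or -1. */
    static int pickBucket(int blockBytes) {
        for (int kb : BUCKET_SIZES_KB) {
            int bucketBytes = kb * 1024;
            if (blockBytes <= bucketBytes) {
                return bucketBytes;
            }
        }
        return -1; // larger than every configured bucket
    }

    /** Bytes left unused in the chosen bucket, or -1 if nothing fits. */
    static int wastedBytes(int blockBytes) {
        int bucketBytes = pickBucket(blockBytes);
        return bucketBytes < 0 ? -1 : bucketBytes - blockBytes;
    }

    public static void main(String[] args) {
        // A block just under 65 KB versus one pushed slightly over it.
        System.out.println("wasted (fits 65 KB): " + wastedBytes(65624));
        System.out.println("wasted (spills over): " + wastedBytes(68224));
    }
}
```

With this example list, a block that grows only a few KB past 65 KB lands in
the 97 KB bucket and leaves roughly 30 KB of it unused.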
So it's not ONLY about fewer blocks being cacheable in the same BC size.
Hope it is clear now.

> Switch default block encoding to ROW_INDEX_V1
> ---------------------------------------------
>
>                 Key: HBASE-23279
>                 URL: https://issues.apache.org/jira/browse/HBASE-23279
>             Project: HBase
>          Issue Type: Wish
>    Affects Versions: 3.0.0, 2.3.0
>            Reporter: Lars Hofhansl
>            Assignee: Viraj Jasani
>            Priority: Minor
>             Fix For: 3.0.0, 2.3.0
>
>         Attachments: HBASE-23279.master.000.patch, 
> HBASE-23279.master.001.patch, HBASE-23279.master.002.patch, 
> HBASE-23279.master.003.patch, HBASE-23279.master.004.patch, 
> HBASE-23279.master.005.patch
>
>
> Currently we set both block encoding and compression to NONE.
> ROW_INDEX_V1 has many advantages and (almost) no disadvantages (the hfiles 
> are slightly larger, about 3% or so). I think that would be a better default 
> than NONE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
