[ 
https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16259214#comment-16259214
 ] 

Eshcar Hillel commented on HBASE-18294:
---------------------------------------

[~anoopsamjohn] I hope you don't mind that I'm adding your comment on the new 
off-heap flush size configuration property here, since it would be easier to 
have this important discussion in the Jira rather than in RB:
bq. Do we really need to add this extra flush size? It will be very difficult 
for users to set all of them. They used to set the flush size. As a normal 
user, when I set a flush size I expect the flush to happen once the data 
reaches this size. We implemented this as heap size oriented in the past 
because everything was on heap. Also, we cannot let the heap size of all 
memstores grow beyond a limit. We have the global heap upper barrier and 
things like that.
bq. The change for 2.0 is to make the per-region flush decision data size 
based, not heap size based. I agree that is a change in the way we were 
working. So flushes will be delayed compared to the previously tuned setups.
bq. I also agree with one of your arguments (in the mail chain, I guess) that 
when we select regions to flush because of global heap pressure, we should 
select the one with the max heap size. Right now that is not happening.
bq. So this points to the need to track both the data size and the heap size 
at region level. Agreed.
bq. Can we still base the flush decision at region level on the data size 
only? When the global heap size is at the barrier, we will select regions 
based on the heap size. When the data size barrier is breached (in the 
off-heap case), we will select regions based on data size. 
bq. I don't think we have to do all this, adding one more off-heap flush size 
etc. Please, let's keep it simple.
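
To make the proposal quoted above easier to discuss, here is a minimal sketch 
of the selection logic it describes. All names (FlushSelectionSketch, 
RegionSizes, shouldFlushRegion, pickUnderHeapPressure, pickUnderDataPressure) 
are hypothetical and are not taken from the HBase code base or from the patch. 
The sketch only assumes that each region tracks both its data size and its 
heap size.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustrative only: these types are assumptions, not HBase classes.
class FlushSelectionSketch {

    // Hypothetical per-region accounting that tracks both sizes side by side.
    static final class RegionSizes {
        final String name;
        final long dataSize; // key-value bytes only
        final long heapSize; // on-heap footprint, including index and metadata

        RegionSizes(String name, long dataSize, long heapSize) {
            this.name = name;
            this.dataSize = dataSize;
            this.heapSize = heapSize;
        }
    }

    // Per-region decision: flush once the data size exceeds the flush size.
    static boolean shouldFlushRegion(RegionSizes region, long flushSize) {
        return region.dataSize > flushSize;
    }

    // Global heap barrier reached: pick the region with the largest heap size.
    static Optional<RegionSizes> pickUnderHeapPressure(List<RegionSizes> regions) {
        return regions.stream()
                .max(Comparator.comparingLong((RegionSizes r) -> r.heapSize));
    }

    // Global data size barrier reached (off-heap case): pick by data size.
    static Optional<RegionSizes> pickUnderDataPressure(List<RegionSizes> regions) {
        return regions.stream()
                .max(Comparator.comparingLong((RegionSizes r) -> r.dataSize));
    }
}
```

Under this scheme the user still configures a single flush size, and the heap 
size only matters when choosing a victim under global heap pressure.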

> Reduce global heap pressure: flush based on heap occupancy
> ----------------------------------------------------------
>
>                 Key: HBASE-18294
>                 URL: https://issues.apache.org/jira/browse/HBASE-18294
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>         Attachments: HBASE-18294.01.patch, HBASE-18294.02.patch, 
> HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch, 
> HBASE-18294.06.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size 
> is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the 
> store to another threshold (that can be configured with 
> hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size 
> (key-value only) to the threshold, whereas it should compare the heap size 
> (which includes index size and metadata).
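
The gap between the two measures in the description above can be illustrated 
with a small calculation. This is only a sketch: the class and method names 
are hypothetical, and the per-cell overhead of 60 bytes is an assumed figure 
chosen for illustration, not a measured HBase value. Only the 128MB threshold 
comes from the description.

```java
// Illustrative only: names and the overhead figure are assumptions.
class HeapVsDataSizeSketch {

    // Default region flush threshold mentioned in the description: 128MB.
    static final long FLUSH_THRESHOLD = 128L * 1024 * 1024;

    // Data size counts key-value bytes only.
    static long dataSize(long cells, long kvBytesPerCell) {
        return cells * kvBytesPerCell;
    }

    // Heap size adds per-cell index and metadata overhead to the data size.
    static long heapSize(long cells, long kvBytesPerCell, long overheadPerCell) {
        return cells * (kvBytesPerCell + overheadPerCell);
    }

    static boolean exceedsThreshold(long size) {
        return size > FLUSH_THRESHOLD;
    }

    public static void main(String[] args) {
        long cells = 1_000_000L;
        long kvBytes = 100L;  // assumed key-value bytes per cell
        long overhead = 60L;  // assumed index + metadata bytes per cell

        long data = dataSize(cells, kvBytes);
        long heap = heapSize(cells, kvBytes, overhead);

        // The same memstore is under the threshold by data size
        // but over it by heap size.
        System.out.println("data size over threshold: " + exceedsThreshold(data));
        System.out.println("heap size over threshold: " + exceedsThreshold(heap));
    }
}
```

With these assumed numbers the data size is 100,000,000 bytes and the heap 
size is 160,000,000 bytes, so a check based on data size does not trigger a 
flush even though the heap occupancy is already past 128MB.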


