[ https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260479#comment-16260479 ]

Anoop Sam John commented on HBASE-18294:
----------------------------------------

Sorry, I missed adding one more thing I wanted to say. I saw somewhere the 
question of how to handle the size used by CCM for its meta data when the 
MSLAB is off heap. That meta data is off heap anyway, and it is not data. I 
would say we can just consider it as part of the data: to effectively reach 
our data we need it, so that would be just ok. What I am saying is that, just 
for CCM, we don't need to add more complexity.
bq.Reaching global pressure too frequently means the RS is not "healthy". Do we 
agree on this??
Ya, reaching the global barrier is not that good, because it will block writes 
while flushes happen. But this situation depends on many factors, and it may 
not mean the RS is unhealthy. The more regions in this RS, the higher the 
chance; more client pressure also means more chances.
Say we have x as the global heap upper barrier and there are n regions. 
Consider 128 MB as the flush size and 4 as the blocking multiplier. Then if 
x >= 128 * 4 * n, we would never see the barrier breached. But that is not an 
advisable way to size things anyway, just saying. Even HDFS slowness can make 
flushes take longer and so breach the barrier. So I don't think we can say 
that breaching the barrier is always too bad and that the RS is then unhealthy.
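To make the arithmetic above concrete, here is a small sketch (hypothetical class and method names, not HBase code) of the no-breach bound, using the 128 MB flush size and blocking multiplier of 4 mentioned above:

```java
public class BarrierMath {
    // Defaults discussed above: 128 MB per-region flush size, blocking multiplier 4.
    static final long FLUSH_SIZE_MB = 128;
    static final long BLOCKING_MULTIPLIER = 4;

    // Smallest global heap upper barrier x (in MB) such that n regions can
    // never collectively exceed it, i.e. x >= 128 * 4 * n.
    static long noBreachBarrierMb(long numRegions) {
        return FLUSH_SIZE_MB * BLOCKING_MULTIPLIER * numRegions;
    }

    public static void main(String[] args) {
        // With 20 regions, x would have to be at least 10240 MB (10 GB),
        // which illustrates why sizing the barrier this way is not advisable.
        System.out.println(noBreachBarrierMb(20));
    }
}
```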
BTW I am not getting how these changes can avoid/reduce that situation.
The argument you made about the selection of a region when there is a global 
size breach makes full sense, and I feel we should correct it. So tracking the 
heapSize at the Region level is needed. (Remember we track it at each Segment 
level and can always sum all segments' heapSize to know this; we do this in 
some flows.) When regions are selected for flush because of a global heap 
barrier breach, we should select the region(s) that release the most heap size.
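A minimal sketch of that selection idea (hypothetical types, not HBase's actual flush-policy API): pick the region whose flush would release the most heap, rather than the one with the most data.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class FlushRegionChooser {
    // Stand-in for a Region; heapSize would be the sum of its segments' heap sizes.
    record RegionSize(String name, long dataSize, long heapSize) {}

    // On a global heap barrier breach, select by heapSize (data + index +
    // metadata), not dataSize, so one flush frees the maximum heap.
    static Optional<RegionSize> chooseForGlobalBreach(List<RegionSize> regions) {
        return regions.stream()
                .max(Comparator.comparingLong(RegionSize::heapSize));
    }
}
```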
For the per-region flush decision (128 MB), IMHO we can continue to check 
against the data size. Or, if the BC is still a concern, we can have a check 
like
if (dataSize >= 128 || heapSize >= 128) flush()
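The suggested check could look like this as a small sketch (hypothetical method, sizes in bytes; the 128 MB default stands in for the configured flush size):

```java
public class FlushCheck {
    static final long FLUSH_SIZE = 128L * 1024 * 1024; // 128 MB default

    // Flush when either the data size (key-values only) or the heap size
    // (data + index + metadata) crosses the per-region threshold.
    static boolean shouldFlush(long dataSize, long heapSize) {
        return dataSize >= FLUSH_SIZE || heapSize >= FLUSH_SIZE;
    }
}
```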

> Reduce global heap pressure: flush based on heap occupancy
> ----------------------------------------------------------
>
>                 Key: HBASE-18294
>                 URL: https://issues.apache.org/jira/browse/HBASE-18294
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>         Attachments: HBASE-18294.01.patch, HBASE-18294.02.patch, 
> HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch, 
> HBASE-18294.06.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size 
> is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the 
> store to another threshold (that can be configured with 
> hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size 
> (key-values only) to the threshold, where it should compare the heap size 
> (which includes index size and metadata).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)