[ 
https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305403#comment-16305403
 ] 

Anoop Sam John commented on HBASE-18294:
----------------------------------------

So when a Cell is added to a Segment, the MemStoreSize will carry the dataSize 
and the totalOccupancy (dataSize + overhead). This MemstoreSize is used at the 
Region level to update the local counters there, and is then passed on to the 
global level (RSAccounting), which is updated the same way. At the global 
level we have 2 checks: an on heap upper barrier (default 40% of Xmx) and an 
off heap barrier which is specified by the user. The check is whether either 
barrier is breached. The code for the check is
{code}
public FlushType isAboveHighWaterMark() {
    // for onheap memstore we check if the global memstore size and the
    // global heap overhead is greater than the global memstore limit
    if (memType == MemoryType.HEAP) {
      if (getGlobalMemStoreHeapSize() >= globalMemStoreLimit) {
        return FlushType.ABOVE_ONHEAP_HIGHER_MARK;
      }
    } else {
      // If the configured memstore is offheap, check for two things
      // 1) If the global memstore data size is greater than the configured
      // 'hbase.regionserver.offheap.global.memstore.size'
      // 2) If the global memstore heap size is greater than the configured onheap
      // global memstore limit 'hbase.regionserver.global.memstore.size'.
      // We do this to avoid OOME incase of scenarios where the heap is occupied with
      // lot of onheap references to the cells in memstore
      if (getGlobalMemStoreDataSize() >= globalMemStoreLimit) {
        // Indicates that global memstore size is above the configured
        // 'hbase.regionserver.offheap.global.memstore.size'
        return FlushType.ABOVE_OFFHEAP_HIGHER_MARK;
      } else if (getGlobalMemStoreHeapSize() >= this.globalOnHeapMemstoreLimit) {
        // Indicates that the offheap memstore's heap overhead is greater than the
        // configured 'hbase.regionserver.global.memstore.size'.
        return FlushType.ABOVE_ONHEAP_HIGHER_MARK;
      }
    }
    return FlushType.NORMAL;
  }
{code}
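The accounting that feeds this check can be sketched roughly as below. This 
is only an illustration under assumptions, not the HBase classes: SizeCounter, 
AccountingSketch, addCell and the byte counts are hypothetical names and 
numbers. It shows one delta (dataSize, dataSize + overhead) being applied the 
same way at Segment, Region and global level.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a size tracker; not the real MemStoreSize.
final class SizeCounter {
    private final AtomicLong dataSize = new AtomicLong(); // key-value bytes only
    private final AtomicLong heapSize = new AtomicLong(); // total occupancy: data + overhead

    void add(long dataDelta, long heapDelta) {
        dataSize.addAndGet(dataDelta);
        heapSize.addAndGet(heapDelta);
    }

    long getDataSize() { return dataSize.get(); }
    long getHeapSize() { return heapSize.get(); }
}

final class AccountingSketch {
    // Adds one cell and propagates the same delta from the segment counter
    // to the region-level and global-level counters.
    static void addCell(SizeCounter segment, SizeCounter region, SizeCounter global,
                        long cellDataBytes, long overheadBytes) {
        long heapDelta = cellDataBytes + overheadBytes;
        segment.add(cellDataBytes, heapDelta);
        region.add(cellDataBytes, heapDelta);
        global.add(cellDataBytes, heapDelta);
    }

    public static void main(String[] args) {
        SizeCounter segment = new SizeCounter();
        SizeCounter region = new SizeCounter();
        SizeCounter global = new SizeCounter();
        // Made-up sizes: 100 and 200 bytes of data, 48 bytes of overhead each.
        addCell(segment, region, global, 100L, 48L);
        addCell(segment, region, global, 200L, 48L);
        System.out.println("global dataSize = " + global.getDataSize());
        System.out.println("global heapSize = " + global.getHeapSize());
    }
}
```

In this sketch a single region and segment feed the global counter; with many 
regions the global counter would simply accumulate the deltas of all of them.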
Now with off heap the Xmx will be much lower. With the suggested change, the 
heapSize will include both the dataSize and the overhead, so the on heap 
barrier will be breached too easily and too often. Otherwise we would have to 
subtract and track the overhead here (overhead = heapSize - dataSize). But 
the issue again is that even when off heap is in use, some cells can still be 
on heap: bigger cells which cannot be cloned to MSLAB, upserted cells, and so 
on. Knowing that at the RSAccounting layer is not possible, unless the 
MemstoreSize tracks all of these separately. On all this my simple question is 
this: off heap is anyway not the default memstore now. Anyone who uses it 
will do so after proper testing and tuning. The delayed flush in the off heap 
case can be avoided by reducing the flush size (say 100 MB instead of 128 
MB). Agreed, the tuning is a bit tedious. But for existing users there is no 
case where off heap comes into use. We can even make changes for off heap 
flushes after 2.0. So my doubt has always been: why not keep things simple? 
If you are ready for a more involved change, we can correct it, and I can 
help. If that is NOT the case, let's keep things as simple as possible. This 
is my call on this.
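
To put rough numbers on the barrier concern, here is a small sketch. It is 
an illustration under assumptions only: BarrierSketch and checkOffHeap are 
hypothetical names, the method just mirrors the off heap branch of the check 
quoted above with sizes and limits passed in, and all sizes (8 GB off heap 
limit, on heap limit of about 40% of a 4 GB Xmx, 2 GB of data, 200 MB of 
overhead, 100 MB of on heap cells) are made up.

```java
final class BarrierSketch {
    enum FlushType { NORMAL, ABOVE_ONHEAP_HIGHER_MARK, ABOVE_OFFHEAP_HIGHER_MARK }

    // Mirrors the off heap branch of the quoted check, with explicit arguments.
    static FlushType checkOffHeap(long dataSize, long heapSize,
                                  long offHeapLimit, long onHeapLimit) {
        if (dataSize >= offHeapLimit) {
            return FlushType.ABOVE_OFFHEAP_HIGHER_MARK;
        } else if (heapSize >= onHeapLimit) {
            return FlushType.ABOVE_ONHEAP_HIGHER_MARK;
        }
        return FlushType.NORMAL;
    }

    public static void main(String[] args) {
        final long MB = 1024L * 1024L;
        long offHeapLimit = 8192 * MB; // assumed off heap memstore limit
        long onHeapLimit = 1638 * MB;  // about 40% of an assumed 4 GB Xmx
        long data = 2048 * MB;         // cell data, assumed to live off heap
        long overhead = 200 * MB;      // on heap references and metadata

        // heapSize counted as overhead only: 200 MB is below the on heap limit.
        System.out.println(checkOffHeap(data, overhead, offHeapLimit, onHeapLimit));

        // heapSize counted as data + overhead: 2248 MB crosses the on heap
        // limit although the data itself is not on the heap.
        long heapSize = data + overhead;
        System.out.println(checkOffHeap(data, heapSize, offHeapLimit, onHeapLimit));

        // Subtracting recovers the overhead only if every cell is off heap.
        // If 100 MB of those cells actually sit on heap, the real heap use is
        // overhead + 100 MB, which the subtraction cannot reveal.
        long derivedOverhead = heapSize - data;
        long actualHeapUse = overhead + 100 * MB;
        System.out.println(derivedOverhead / MB + " MB derived vs "
            + actualHeapUse / MB + " MB actually on heap");
    }
}
```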


> Reduce global heap pressure: flush based on heap occupancy
> ----------------------------------------------------------
>
>                 Key: HBASE-18294
>                 URL: https://issues.apache.org/jira/browse/HBASE-18294
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>             Fix For: 2.0.0-beta-2
>
>         Attachments: HBASE-18294.01.patch, HBASE-18294.02.patch, 
> HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch, 
> HBASE-18294.06.patch, HBASE-18294.07.patch, HBASE-18294.07.patch, 
> HBASE-18294.08.patch, HBASE-18294.09.patch, HBASE-18294.10.patch, 
> HBASE-18294.11.patch, HBASE-18294.11.patch, HBASE-18294.12.patch, 
> HBASE-18294.13.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size 
> is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the 
> store to another threshold (that can be configured with 
> hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size 
> (key-value only) to the threshold where it should compare the heap size 
> (which includes index size, and metadata).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
