[ 
https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16304469#comment-16304469
 ] 

Anoop Sam John commented on HBASE-18294:
----------------------------------------

bq.we make flush decisions based on heap-occupancy instead of data size; same 
care for on- and off-heap cases
This is what I am still not getting, or not convinced about.  I went through 
the patch quickly.  We get the cell's data size and heap size in MemstoreSize 
after the cell has been added to the Segment.  There you can see that the sizes 
are calculated from the cell's length and its heap size.
{code}
long heapSize = heapSizeChange(cellToAdd, succ);
incSize(cellSize, heapSize);
if (memstoreSizing != null) {
  memstoreSizing.incMemStoreSize(cellSize, heapSize);
}
...
protected long heapSizeChange(Cell cell, boolean succ) {
  if (succ) {
    return ClassSize
        .align(indexEntrySize() + PrivateCellUtil.estimatedHeapSizeOf(cell));
  }
  return 0;
}
...
public static long estimatedHeapSizeOf(final Cell cell) {
  if (cell instanceof HeapSize) {
    return ((HeapSize) cell).heapSize();
  }
  // TODO: Add sizing of references that hold the row, family, etc., arrays.
  return estimatedSerializedSizeOf(cell);
}
{code}

When off heap, we have ByteBufferKV cell objects, and their heapSize is 
implemented as 
{code}
public long heapSize() {
  if (this.buf.hasArray()) {
    return ClassSize.align(FIXED_OVERHEAD + length);
  }
  return ClassSize.align(FIXED_OVERHEAD);
}
{code}
This is correct.  When off heap, the data size carries the off-heap occupancy 
and the heap size contains only the overhead value.  The cell key and value 
size (which are in the DBB) is NOT added to the heap size, which is also 
correct.  So if the decision at the Region level is based only on heap size, 
this will be an issue in the off-heap case.  Global accounting seems fine; no 
changes there.  Please correct me where I am getting things wrong.
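
To make the concern concrete, here is a rough, self-contained sketch of the 
accounting described above.  It is not code from the patch: the class name, 
the helper methods, and the FIXED_OVERHEAD value of 56 bytes are all 
assumptions for illustration, and the index entry size is left out for 
brevity.  It only mirrors the shape of the quoted heapSize(): the payload 
counts toward heap size when the cell is backed by an on-heap array, and 
only the object overhead counts when it is not.

```java
// Hypothetical sketch; names and constants are illustrative, not HBase code.
class FlushAccountingSketch {
    // Assumed per-cell object overhead in bytes (the real FIXED_OVERHEAD differs).
    static final long FIXED_OVERHEAD = 56;
    // Region flush threshold, using the 128MB default from the issue description.
    static final long FLUSH_THRESHOLD = 128L * 1024 * 1024;

    // Round up to a multiple of 8 bytes, standing in for ClassSize.align.
    static long align(long n) {
        return (n + 7) & ~7L;
    }

    // Mirrors the quoted heapSize(): payload is counted only for on-heap cells.
    static long heapSize(int length, boolean onHeap) {
        return onHeap ? align(FIXED_OVERHEAD + length) : align(FIXED_OVERHEAD);
    }

    // Returns {dataSize, heapSize} after adding `cells` cells of `length` bytes.
    static long[] accumulate(int cells, int length, boolean onHeap) {
        long data = 0;
        long heap = 0;
        for (int i = 0; i < cells; i++) {
            data += length;
            heap += heapSize(length, onHeap);
        }
        return new long[] {data, heap};
    }

    // Flush decision based on data size (key-value bytes only).
    static boolean shouldFlushByData(long[] sizes) {
        return sizes[0] > FLUSH_THRESHOLD;
    }

    // Flush decision based on heap size only.
    static boolean shouldFlushByHeap(long[] sizes) {
        return sizes[1] > FLUSH_THRESHOLD;
    }

    public static void main(String[] args) {
        long[] onHeap = accumulate(200_000, 1000, true);
        long[] offHeap = accumulate(200_000, 1000, false);
        System.out.println("on-heap  flush by heap size: " + shouldFlushByHeap(onHeap));
        System.out.println("off-heap flush by heap size: " + shouldFlushByHeap(offHeap));
        System.out.println("off-heap flush by data size: " + shouldFlushByData(offHeap));
    }
}
```

Under these assumptions, 200,000 cells of 1000 bytes each hold about 200MB of 
data in both cases, but the off-heap region reports only about 11MB of heap 
size, so a region-level check on heap size alone would not trigger a flush 
there.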

> Reduce global heap pressure: flush based on heap occupancy
> ----------------------------------------------------------
>
>                 Key: HBASE-18294
>                 URL: https://issues.apache.org/jira/browse/HBASE-18294
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>         Attachments: HBASE-18294.01.patch, HBASE-18294.02.patch, 
> HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch, 
> HBASE-18294.06.patch, HBASE-18294.07.patch, HBASE-18294.07.patch, 
> HBASE-18294.08.patch, HBASE-18294.09.patch, HBASE-18294.10.patch, 
> HBASE-18294.11.patch, HBASE-18294.11.patch, HBASE-18294.12.patch, 
> HBASE-18294.13.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size 
> is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the 
> store to another threshold (that can be configured with 
> hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size 
> (key-value only) to the threshold where it should compare the heap size 
> (which includes index size, and metadata).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
