[
https://issues.apache.org/jira/browse/HBASE-17338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15775859#comment-15775859
]
Anoop Sam John commented on HBASE-17338:
----------------------------------------
In the case of on heap MSLAB, all the size calculations are based on the
global memstore data size + heap overhead. That is why I was not adding the
cell data size in the on heap case. But I think I can do it in a slightly
different way.
So instead of tracking the data size and heap overhead, we will track cell
data size and heap size. Please note that this is not heap overhead alone; it
is the total heap space occupied by the memstore(s). I can change the
calculation accordingly. For on heap MSLAB, this heap size will be the same as
what we had before the off heap write path work (i.e. it will include all the
cell data size plus the overhead). The new accounting, i.e. cellDataSize, will
include only the cell data bytes. It will then be a subset of the former. I
will make all the necessary changes. This will be a bigger patch, as I will
rename everything in the related area from heapOverhead to heapSize or so. I
believe that will look cleaner. Thoughts?
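To make the proposed accounting concrete, here is a minimal sketch of the two counters. The class and method names are hypothetical and are not taken from the patch; the `dataOnHeap` flag is an assumption standing in for "on heap MSLAB, or a cell that was not copied to MSLAB".

```java
// Hypothetical sketch only; these names are illustrative, not HBase APIs.
final class MemstoreSizeSketch {
    private long cellDataSize; // cell data bytes only
    private long heapSize;     // total heap space occupied by the memstore

    // cellBytes:     size of the cell data
    // overheadBytes: per-cell heap overhead (Cell object, CSLM entry, ...)
    // dataOnHeap:    true when the cell bytes themselves live on heap
    void add(long cellBytes, long overheadBytes, boolean dataOnHeap) {
        cellDataSize += cellBytes;
        // The overhead is always on heap; the cell data counts only if it is on heap.
        heapSize += overheadBytes + (dataOnHeap ? cellBytes : 0);
    }

    long getCellDataSize() { return cellDataSize; }

    long getHeapSize() { return heapSize; }
}
```

With on heap MSLAB every cell is added with `dataOnHeap = true`, so `heapSize` equals data plus overhead and `cellDataSize` is a subset of it, as described above.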
bq. need to read on why Append/Increment can't be put in offheap.
Append/Increment does not add cells into the MSLAB area. This is to avoid
MSLAB wastage. Say the same cell gets incremented 1000 times and the cell
key + value size is 100 bytes. If the cell from every increment (add to
memstore) were copied to MSLAB, we would take ~100KB of MSLAB space overall,
whereas only 100 bytes are valid at any point. Each former cell is deleted by
the addition of a new cell. The Cell POJO as such is removed from the CSLM,
but we cannot free up that space in MSLAB. MSLAB is not designed to work that
way: chunk allocation, and offset allocation within a chunk, happen in a
serially incrementing manner. We cannot mark some space in between as free
and reuse it. That would make things very complex for us. So to avoid all
this, the simplest way was to NOT use MSLAB for upsert.
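The wastage argument can be illustrated with a toy bump-pointer chunk. This is a simplified sketch and not the real MSLAB code: the class name is hypothetical, and the 2 MB chunk capacity is an assumed value chosen for the example.

```java
// Toy bump-pointer allocator; a simplified stand-in for an MSLAB chunk.
final class BumpChunkSketch {
    private final byte[] data;
    private int nextFreeOffset = 0;

    BumpChunkSketch(int capacity) {
        data = new byte[capacity];
    }

    // Returns the start offset of the allocation, or -1 if the chunk is full.
    // There is deliberately no free(): offsets only ever move forward.
    int allocate(int size) {
        if (size < 0 || size > data.length - nextFreeOffset) {
            return -1;
        }
        int offset = nextFreeOffset;
        nextFreeOffset += size;
        return offset;
    }

    int usedBytes() {
        return nextFreeOffset;
    }

    public static void main(String[] args) {
        BumpChunkSketch chunk = new BumpChunkSketch(2 * 1024 * 1024); // assumed chunk size
        int cellSize = 100;
        for (int i = 0; i < 1000; i++) {
            chunk.allocate(cellSize); // each increment would write a fresh copy
        }
        // prints: used=100000 live=100
        System.out.println("used=" + chunk.usedBytes() + " live=" + cellSize);
    }
}
```

Only the last 100 bytes are still referenced from the CSLM, yet the chunk has handed out 100,000 bytes and has no way to reclaim the rest until the whole chunk is released.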
> Treat Cell data size under global memstore heap size only when that Cell can
> not be copied to MSLAB
> ---------------------------------------------------------------------------------------------------
>
> Key: HBASE-17338
> URL: https://issues.apache.org/jira/browse/HBASE-17338
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver
> Affects Versions: 2.0.0
> Reporter: Anoop Sam John
> Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-17338.patch
>
>
> We have only data size and heap overhead being tracked globally. Off heap
> memstore works with an off heap backed MSLAB pool. But a cell, when added to
> the memstore, is not always copied to MSLAB. Append/Increment ops, which do
> an upsert, don't use MSLAB. Also, based on the Cell size, we sometimes avoid
> the MSLAB copy. But currently we track the data size of these cells too under
> the global memstore data size, which indicates the off heap size in the case
> of off heap memstore. For global flush checks (against lower/upper watermark
> levels), we check this size against the max off heap memstore size. We do
> check the heap overhead against the global heap memstore size (defaults to
> 40% of Xmx). But for such cells the data size should also be accounted under
> the heap overhead.
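
The global check described in the issue can be sketched roughly as follows. All names here are hypothetical and do not come from HBase; the lower and upper marks are passed in as parameters, and only the 40% default is taken from the description above.

```java
// Hypothetical sketch of a global watermark check; not HBase code.
final class GlobalFlushCheckSketch {
    enum Pressure { NONE, ABOVE_LOWER_MARK, ABOVE_UPPER_MARK }

    // Compares one tracked size against its lower and upper marks.
    static Pressure check(long size, long lowerMark, long upperMark) {
        if (size >= upperMark) {
            return Pressure.ABOVE_UPPER_MARK;
        }
        if (size >= lowerMark) {
            return Pressure.ABOVE_LOWER_MARK;
        }
        return Pressure.NONE;
    }

    // Global heap memstore size, using the 40% of Xmx default mentioned above.
    static long heapMemstoreLimit(long xmxBytes) {
        return xmxBytes * 4 / 10;
    }
}
```

The same `check` would be applied twice: once to the data size against the off heap limit, and once to the heap figure against `heapMemstoreLimit`. The point of the issue is that the data of cells not copied to MSLAB must feed the second check, not the first.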