[
https://issues.apache.org/jira/browse/HBASE-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15760395#comment-15760395
]
Anoop Sam John commented on HBASE-16421:
----------------------------------------
There is a difference between the read path off-heaping work and this one. In
the read path off-heaping, the starting point is the off-heap backed
BucketCache, and the block data bytes are in off-heap ByteBuffers. Even before
the HBASE-11425 work, in the read path we always start with block bytes (64 KB
default size). The data can now be read from HDFS or from the L1 or L2 cache;
in all cases these are plain bytes. The read path first reads these bytes and
creates Cells out of them. Please remember that we do not do any byte copy for
this (especially after the HBASE-11425 work). The only thing is that Cell POJOs
are created wrapping the data bytes in the on-heap or off-heap area. So this is
not avoidable at all. Before and after the off-heaping read path work, this
part was/is the same.
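
To make the no-copy point concrete, here is a rough standalone sketch (the
class and field names are mine, not the actual HBase KeyValue /
ByteBufferKeyValue code): the cell POJO only holds a reference into the block
buffer plus an offset and length, whether that buffer is on heap or off heap.
{code:java}
import java.nio.ByteBuffer;

// Illustrative only: a cell that wraps block bytes without copying them.
// The real HBase cell classes are much richer, but the principle is the
// same: the POJO holds a buffer reference + offset + length, no byte copy.
public class WrappedCell {
    private final ByteBuffer block;   // on-heap or off-heap block bytes
    private final int offset;         // where this cell starts inside the block
    private final int length;         // serialized cell length

    public WrappedCell(ByteBuffer block, int offset, int length) {
        this.block = block;           // reference only, no copy of the bytes
        this.offset = offset;
        this.length = length;
    }

    // Reading a byte of the cell goes straight to the block buffer.
    public byte byteAt(int i) {
        return block.get(offset + i);
    }

    public static void main(String[] args) {
        // Works the same whether the block lives on heap or off heap.
        ByteBuffer onHeap  = ByteBuffer.allocate(64 * 1024);
        ByteBuffer offHeap = ByteBuffer.allocateDirect(64 * 1024);
        WrappedCell c1 = new WrappedCell(onHeap, 0, 100);
        WrappedCell c2 = new WrappedCell(offHeap, 128, 100);
        System.out.println(c1.byteAt(0) + " " + c2.byteAt(0));
    }
}
{code}
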
Here the difference is this: as of now, we always have Cell objects in the
Segments. The segments may or may not be flattened; depending on that, the
cells might be in a CSLM or in a Cell[]. Either way, the Cell POJOs are there.
When a scan/read comes, we serve back those Cell objects. When there is a need
for in-memory compaction or an on-disk flush, we will have a scanner associated
with it that reads out the Cells. You can see it is just retrieval or iteration
over the existing POJOs.
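
A rough sketch of that current situation (names are illustrative, not the
actual Segment/CellSet classes): whether the index is a skip list or a flat
array, it only points at Cell objects that already exist, so a flush or
compaction scanner simply iterates them without allocating new cells.
{code:java}
import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.ConcurrentSkipListMap;

// Illustrative sketch of a segment index: flattened (Cell[]) or not (CSLM),
// the index only references Cell POJOs that already exist.
public class SegmentIndexSketch {
    record SimpleCell(String rowKey) {}

    public static void main(String[] args) {
        SimpleCell a = new SimpleCell("r1");
        SimpleCell b = new SimpleCell("r2");

        // Mutable segment: cells indexed in a ConcurrentSkipListMap (CSLM).
        ConcurrentSkipListMap<String, SimpleCell> cslm = new ConcurrentSkipListMap<>();
        cslm.put(a.rowKey(), a);
        cslm.put(b.rowKey(), b);

        // Flattened segment: the same cell references, now in a sorted array.
        SimpleCell[] flat = cslm.values().toArray(new SimpleCell[0]);

        // A flush/compaction "scanner" over either form is just iteration;
        // no new cell objects are created, the existing POJOs are returned.
        Iterator<SimpleCell> scanner = Arrays.asList(flat).iterator();
        while (scanner.hasNext()) {
            System.out.println(scanner.next().rowKey());
        }
    }
}
{code}
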
When we have the CellChunkMap it will be different. We will get rid of Cell
objects as such; what we have instead is some index data (ChunkId + offset +
length), and for every Cell we will have to convert this index entry into a
POJO cell object. For the actual Scan/Get working on this Segment, that is
unavoidable: we must work with Cells then. But doing the same even for
in-memory compaction / on-disk flush will be too much. We got rid of many Java
objects (Cells) by doing the flattening to the ChunkMap, and then in between we
create those objects again! This will create so much garbage and affect GC.
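
A rough sketch of the concern (again illustrative names, not the actual
CellChunkMap code): the index stores only (chunkId, offset, length) entries,
and every access materializes a fresh cell POJO, so a full iteration for
in-memory compaction or flush re-creates all the objects the flattening just
eliminated.
{code:java}
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: a CellChunkMap-style index keeps only numeric
// references into chunks; reading a cell materializes a new POJO each time.
public class ChunkIndexSketch {
    record CellRef(int chunkId, int offset, int length) {}
    record MaterializedCell(ByteBuffer chunk, int offset, int length) {}

    private final Map<Integer, ByteBuffer> chunks = new HashMap<>();
    private final CellRef[] index;

    ChunkIndexSketch(CellRef[] index, Map<Integer, ByteBuffer> chunkById) {
        this.index = index;
        this.chunks.putAll(chunkById);
    }

    // Each access converts the (chunkId, offset, length) entry into a cell
    // object. That is fine for a single Get/Scan, but doing it for every
    // cell during in-memory compaction or flush allocates one short-lived
    // object per cell, i.e. the garbage this comment is worried about.
    MaterializedCell cellAt(int i) {
        CellRef ref = index[i];
        return new MaterializedCell(chunks.get(ref.chunkId()), ref.offset(), ref.length());
    }

    public static void main(String[] args) {
        ByteBuffer chunk = ByteBuffer.allocateDirect(2 * 1024 * 1024); // one 2 MB chunk
        CellRef[] idx = { new CellRef(0, 0, 100), new CellRef(0, 100, 120) };
        ChunkIndexSketch map = new ChunkIndexSketch(idx, Map.of(0, chunk));

        // Iterating for a flush allocates a MaterializedCell per entry.
        for (int i = 0; i < idx.length; i++) {
            System.out.println(map.cellAt(i).length());
        }
    }
}
{code}
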
I hope I am explaining the difference and the impact more clearly now. :-)
Sorry if I was not clear earlier.
> Introducing the CellChunkMap as a new additional index variant in the MemStore
> ------------------------------------------------------------------------------
>
> Key: HBASE-16421
> URL: https://issues.apache.org/jira/browse/HBASE-16421
> Project: HBase
> Issue Type: Umbrella
> Reporter: Anastasia Braginsky
> Attachments: CellChunkMapRevived.pdf,
> IntroductiontoNewFlatandCompactMemStore.pdf
>
>
> Follow-up for HBASE-14921. This is going to be the umbrella JIRA to include
> all the parts of the integration of the CellChunkMap into the MemStore.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)