[
https://issues.apache.org/jira/browse/HBASE-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15760395#comment-15760395
]
Anoop Sam John commented on HBASE-16421:
----------------------------------------
There is a difference between the read path off-heaping work and this one. In
the read path off-heaping, the starting point is the off-heap backed
BucketCache, and the block data bytes are in off-heap ByteBuffers. Even before
the HBASE-11425 work, in the read path we always start with block bytes (64 KB
default size). The data can now be read from HDFS or from the L1 or L2 cache;
in all cases these are plain bytes. The read path first reads these bytes and
creates Cells out of them. Please remember that we do not do any byte copy for
this (especially after the HBASE-11425 work). The only thing is that Cell POJOs
are created wrapping the data bytes in the on-heap or off-heap area. So this is
not avoidable at all. Before and after the off-heaping read path work, this
part was/is the same.
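
To make the no-copy point concrete, here is a rough standalone sketch (the
class and field names are mine, not the actual HBase KeyValue /
ByteBufferKeyValue code): the cell POJO only holds a reference into the block
buffer plus an offset and length, whether that buffer is on heap or off heap.
{code:java}
import java.nio.ByteBuffer;

// Illustrative only: a cell that wraps block bytes without copying them.
// The real HBase cell classes are much richer, but the principle is the
// same: the POJO holds a buffer reference + offset + length, no byte copy.
public class WrappedCell {
    private final ByteBuffer block;   // on-heap or off-heap block bytes
    private final int offset;         // where this cell starts inside the block
    private final int length;         // serialized cell length

    public WrappedCell(ByteBuffer block, int offset, int length) {
        this.block = block;           // reference only, no copy of the bytes
        this.offset = offset;
        this.length = length;
    }

    // Reading a byte of the cell goes straight to the block buffer.
    public byte byteAt(int i) {
        return block.get(offset + i);
    }

    public static void main(String[] args) {
        // Works the same whether the block lives on heap or off heap.
        ByteBuffer onHeap  = ByteBuffer.allocate(64 * 1024);
        ByteBuffer offHeap = ByteBuffer.allocateDirect(64 * 1024);
        WrappedCell c1 = new WrappedCell(onHeap, 0, 100);
        WrappedCell c2 = new WrappedCell(offHeap, 128, 100);
        System.out.println(c1.byteAt(0) + " " + c2.byteAt(0));
    }
}
{code}
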
Here the difference is this: as of now, we always have Cell objects in the
Segments. The segments may or may not be flattened; depending on that, the
cells might be in a CSLM or in a Cell[]. Either way, the Cell POJOs are there.
When a scan/read comes, we serve back those Cell objects. When there is a need
for in-memory compaction or an on-disk flush, we will have a scanner associated
with it that reads out the Cells. You can see it is just retrieval or iteration
over the existing POJOs.
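
A rough sketch of that current situation (names are illustrative, not the
actual Segment/CellSet classes): whether the index is a skip list or a flat
array, it only points at Cell objects that already exist, so a flush or
compaction scanner simply iterates them without allocating new cells.
{code:java}
import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.ConcurrentSkipListMap;

// Illustrative sketch of a segment index: flattened (Cell[]) or not (CSLM),
// the index only references Cell POJOs that already exist.
public class SegmentIndexSketch {
    record SimpleCell(String rowKey) {}

    public static void main(String[] args) {
        SimpleCell a = new SimpleCell("r1");
        SimpleCell b = new SimpleCell("r2");

        // Mutable segment: cells indexed in a ConcurrentSkipListMap (CSLM).
        ConcurrentSkipListMap<String, SimpleCell> cslm = new ConcurrentSkipListMap<>();
        cslm.put(a.rowKey(), a);
        cslm.put(b.rowKey(), b);

        // Flattened segment: the same cell references, now in a sorted array.
        SimpleCell[] flat = cslm.values().toArray(new SimpleCell[0]);

        // A flush/compaction "scanner" over either form is just iteration;
        // no new cell objects are created, the existing POJOs are returned.
        Iterator<SimpleCell> scanner = Arrays.asList(flat).iterator();
        while (scanner.hasNext()) {
            System.out.println(scanner.next().rowKey());
        }
    }
}
{code}
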
When we have the CellChunkMap it will be different. We will get rid of Cell
objects as such; what we have instead is some index data (ChunkId + offset +
length), and for every Cell we will have to convert this index entry into a
POJO cell object. For the actual Scan/Get working on this Segment, that is
unavoidable: we must work with Cells then. But doing the same even for
in-memory compaction / on-disk flush will be too much. We got rid of many Java
objects (Cells) by doing the flattening to the ChunkMap, and then in between we
create those objects again! This will create so much garbage and affect GC.
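
A rough sketch of the concern (again illustrative names, not the actual
CellChunkMap code): the index stores only (chunkId, offset, length) entries,
and every access materializes a fresh cell POJO, so a full iteration for
in-memory compaction or flush re-creates all the objects the flattening just
eliminated.
{code:java}
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: a CellChunkMap-style index keeps only numeric
// references into chunks; reading a cell materializes a new POJO each time.
public class ChunkIndexSketch {
    record CellRef(int chunkId, int offset, int length) {}
    record MaterializedCell(ByteBuffer chunk, int offset, int length) {}

    private final Map<Integer, ByteBuffer> chunks = new HashMap<>();
    private final CellRef[] index;

    ChunkIndexSketch(CellRef[] index, Map<Integer, ByteBuffer> chunkById) {
        this.index = index;
        this.chunks.putAll(chunkById);
    }

    // Each access converts the (chunkId, offset, length) entry into a cell
    // object. That is fine for a single Get/Scan, but doing it for every
    // cell during in-memory compaction or flush allocates one short-lived
    // object per cell, i.e. the garbage this comment is worried about.
    MaterializedCell cellAt(int i) {
        CellRef ref = index[i];
        return new MaterializedCell(chunks.get(ref.chunkId()), ref.offset(), ref.length());
    }

    public static void main(String[] args) {
        ByteBuffer chunk = ByteBuffer.allocateDirect(2 * 1024 * 1024); // one 2 MB chunk
        CellRef[] idx = { new CellRef(0, 0, 100), new CellRef(0, 100, 120) };
        ChunkIndexSketch map = new ChunkIndexSketch(idx, Map.of(0, chunk));

        // Iterating for a flush allocates a MaterializedCell per entry.
        for (int i = 0; i < idx.length; i++) {
            System.out.println(map.cellAt(i).length());
        }
    }
}
{code}
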
I hope I am explaining the difference and the impact more clearly now. :-)
Sorry if I was not clear earlier.
> Introducing the CellChunkMap as a new additional index variant in the MemStore
> ------------------------------------------------------------------------------
>
> Key: HBASE-16421
> URL: https://issues.apache.org/jira/browse/HBASE-16421
> Project: HBase
> Issue Type: Umbrella
> Reporter: Anastasia Braginsky
> Attachments: CellChunkMapRevived.pdf,
> IntroductiontoNewFlatandCompactMemStore.pdf
>
>
> Follow-up for HBASE-14921. This is going to be the umbrella JIRA to include
> all the parts of the integration of the CellChunkMap into the MemStore.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)