[jira] [Commented] (HBASE-16421) Introducing the CellChunkMap as a new additional index variant in the MemStore

Anastasia Braginsky (JIRA) Tue, 20 Dec 2016 04:01:43 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15764042#comment-15764042
 ]


Anastasia Braginsky commented on HBASE-16421:
---------------------------------------------

A collective answer to some of the issues raised after the road map was published:
--------------------------------------------------------------------------------------------------------------
[~ram_krish]:

bq. We may need the new type of Cell which has the chunk id in it?

This is a possibility. We may have ChunkCell and HeapCell derived from Cell. 
What about storing the chunk ID as the first integer in each chunk’s byte 
buffer? Then each cell that knows its offset and byte buffer can just read the 
ID from there and return it. A Cell that has no underlying MSLAB chunk can 
return -1 as its chunk ID. What do you think?
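
To make the proposal concrete, here is a minimal sketch of the idea, not a patch. All names in it (ChunkIdSketch, SketchCell, ChunkBackedCell, HeapOnlyCell, newChunk, copyIntoChunk) are hypothetical and are not existing HBase classes or methods. It assumes the cell holds a reference to the whole chunk buffer, so the ID always sits at absolute index 0.

```java
import java.nio.ByteBuffer;

public class ChunkIdSketch {
    // Value returned by cells that are not backed by an MSLAB chunk.
    static final int NO_CHUNK_ID = -1;
    // The chunk ID occupies the first integer of the chunk's buffer.
    static final int CHUNK_ID_BYTES = Integer.BYTES;

    // Hypothetical stand-in for the Cell interface; only the new method is shown.
    interface SketchCell {
        int getChunkId();
    }

    // Allocates a chunk (off-heap if requested) and stamps its ID at index 0.
    static ByteBuffer newChunk(int chunkId, int capacity, boolean offHeap) {
        ByteBuffer buf = offHeap
            ? ByteBuffer.allocateDirect(capacity)
            : ByteBuffer.allocate(capacity);
        buf.putInt(0, chunkId);
        // Cell data starts right after the ID.
        buf.position(CHUNK_ID_BYTES);
        return buf;
    }

    // A cell whose bytes live inside a chunk: it knows its buffer and offset.
    static final class ChunkBackedCell implements SketchCell {
        final ByteBuffer chunk;
        final int offset;
        final int length;

        ChunkBackedCell(ByteBuffer chunk, int offset, int length) {
            this.chunk = chunk;
            this.offset = offset;
            this.length = length;
        }

        @Override
        public int getChunkId() {
            // Absolute read; does not disturb the buffer's position.
            return chunk.getInt(0);
        }
    }

    // A cell allocated on the Java heap, outside of any chunk.
    static final class HeapOnlyCell implements SketchCell {
        @Override
        public int getChunkId() {
            return NO_CHUNK_ID;
        }
    }

    // Copies cell bytes into the chunk, using the buffer position as the
    // allocation pointer (single-threaded here, for simplicity).
    static ChunkBackedCell copyIntoChunk(ByteBuffer chunk, byte[] data) {
        int offset = chunk.position();
        chunk.put(data);
        return new ChunkBackedCell(chunk, offset, data.length);
    }

    public static void main(String[] args) {
        ByteBuffer chunk = newChunk(7, 64, true);
        SketchCell inChunk = copyIntoChunk(chunk, new byte[] {1, 2, 3});
        SketchCell onHeap = new HeapOnlyCell();
        System.out.println(inChunk.getChunkId());
        System.out.println(onHeap.getChunkId());
    }
}
```

With this layout no separate Cell type has to carry the ID as a field; the cost is four bytes per chunk and one absolute read per lookup.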

bq. We have an internal branch which was doing the Pipeline flushing and 
creating n number of segments per snapshot. I could use that for now to test 
this. But if you need to test in latest trunk - can you prepare a patch with 
CellChunkMap and integrate it with the current trunk? I can give some patches 
on the #2 subtask for creating chunk id and having a cell with chunk id.
Atleast from our earlier reports one thing is sure that we do create garbage 
during flush for the cell creation but the overall impact of GC was much 
better. So I think we are benefited there, but with the scan perf I think we 
have not done any tests. For now I can do it with our internal branch but not 
on latest trunk.

It is OK that your evaluation will not be on the latest trunk; what is 
important is that the chunks are off-heap. Integrating CellChunkMap into the 
current trunk is everything that needs to be done in task number 2. It is not a 
small issue, so better not to make it a prerequisite for the prerequisite. I 
think your patch should be good enough if it uses off-heap. When you say: “I 
can give some patches on the #2 subtask for creating chunk id and having a cell 
with chunk id” do you mean #2 among the prerequisites or #2 among the road-map 
tasks? I should actually number them differently anyhow :)

--------------------------------------------------------------------------------------------------------------
[~stack]:

bq. Sorry... prob. w/ upserted cells is? Why would they not be allocated on 
MSLAB?

In our last meeting we talked about cells upserted/updated by the 
append/increment operations, which are not allocated on MSLAB. More generally, 
this applies to any cell (small enough to fit in a regular chunk) that is not 
allocated on the MSLAB even though MSLAB is enabled. 

bq. Do we think these allocations long-lived? That they will migrate to 
permanent heap?

The lifetime of those chunks depends on the lifetime of the cell for which the 
variable-size chunk is allocated. By “permanent heap” do you mean the JVM’s 
non-heap Permanent Generation area? If so, then I do not think something 
allocated dynamically can ever move to the permanent heap. It should be only 
for the JVM’s metadata and statics. But maybe I am missing something.

--------------------------------------------------------------------------------------------------------------
[~anoop.hbase]:

bq. A way to flush (to disk) chunk mapped segment directly with NO need to 
again make on heap Cell objects.. This is going to a big change I guess. The 
entire flush path work based on a scanner and that path need Cells.

Generally I agree it would be better to flush without creating Cell objects. 
But if this is a critical item, then what about the performance of all other 
scans? After all, flush uses the same scan as the others. All those paths need 
Cells, and the flush scan is the less frequent one, I think. If we generally 
think we need “A way to *scan* chunk mapped segment directly with NO need to 
again make on heap Cell objects”, then this is a big issue indeed. This is why 
we need the scan evaluation, and if the impact is big, we need to rethink the 
entire issue again.

bq. Same way as above for the in memory compaction of 1+ chunk mapped segments.

Please note that we do not plan to do in-memory compaction (the EAGER one) 
when CellChunkMap segments are used. CellChunkMap must go with MSLAB and 
In-Memory-Compaction must go without MSLAB...

> Introducing the CellChunkMap as a new additional index variant in the MemStore
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-16421
>                 URL: https://issues.apache.org/jira/browse/HBASE-16421
>             Project: HBase
>          Issue Type: Umbrella
>            Reporter: Anastasia Braginsky
>         Attachments: CellChunkMapRevived.pdf, 
> IntroductiontoNewFlatandCompactMemStore.pdf
>
>
> Follow up for HBASE-14921. This is going to be the umbrella JIRA to include 
> all the parts of integration of the CellChunkMap to the MemStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
