[ 
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171229#comment-15171229
 ] 

Anastasia Braginsky commented on HBASE-14921:
---------------------------------------------

I apologize for not explaining it well. 
To clarify, I wrote the attached document. It is not long and it has 
pictures :)
Maybe I am missing something, so please show me where my understanding is wrong.
I am answering briefly here, but please, please, please also take a look at the 
attached document.

bq. What will the serialization/format-transform look like (if any)?

I think there is no format transform between CellBlocksSegment and HFile (if I 
understand you correctly).
Flushing a snapshot to disk is done exactly as before: data is written 
from a scanner to a sink (HFile via StoreFile).
But please look at “How CellBlocksSegment is transfered to HFile?” in the 
document.
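
A minimal sketch of the scanner-to-sink idea, assuming a flush loop that only 
sees an iterator over sorted cells. The names FlushSketch, Sink and flush are 
hypothetical and are not the HBase API; cells are plain strings here for 
brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: these types are illustrative, not the HBase API.
class FlushSketch {
    // Stand-in for the on-disk writer (the "sink").
    interface Sink {
        void append(String cell);
    }

    // The flush loop sees only an iterator over sorted cells, so the
    // in-memory layout behind that iterator does not matter to it.
    static int flush(Iterator<String> scanner, Sink sink) {
        int written = 0;
        while (scanner.hasNext()) {
            sink.append(scanner.next());
            written++;
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> file = new ArrayList<>();
        List<String> snapshot = Arrays.asList("row1/a", "row1/b", "row2/a");
        int n = flush(snapshot.iterator(), file::add);
        System.out.println(n + " cells written: " + file);
    }
}
```

Under this assumption, swapping the structure behind the iterator leaves the 
flush loop unchanged.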

bq. After that the Cell object is created and the reference to this Cell is 
inserted into the skip-list to accelerate the search.
bq. Yes. This is a copy. Would be good if we did not have to do this.

Note that CellBlocksSegments are created as a result of the compaction 
process. This is how we compact: we take a mix of “obsolete” cells and 
“updated” cells and copy only the “updated” cells to another place. Then the 
memory holding the mix can be released. Please look at “Why copies are needed 
in compacting process?” in the document.
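
A minimal sketch of this copy step, assuming that “updated” means the most 
recent version of each key. The names CompactionSketch, SimpleCell and compact 
are hypothetical, not HBase classes, and real compaction also has to handle 
deletes and multiple retained versions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: names are illustrative, not HBase classes.
class CompactionSketch {
    static final class SimpleCell {
        final String key;
        final long seqId;   // larger means more recent (assumption)
        final String value;

        SimpleCell(String key, long seqId, String value) {
            this.key = key;
            this.seqId = seqId;
            this.value = value;
        }

        SimpleCell copy() {
            return new SimpleCell(key, seqId, value);
        }
    }

    // Select the newest version of each key, then copy only those cells
    // into a fresh list sorted by key. The input list holding the mix of
    // obsolete and updated cells can be released afterwards.
    static List<SimpleCell> compact(List<SimpleCell> mixed) {
        Map<String, SimpleCell> newest = new TreeMap<>();
        for (SimpleCell c : mixed) {
            SimpleCell prev = newest.get(c.key);
            if (prev == null || c.seqId > prev.seqId) {
                newest.put(c.key, c);
            }
        }
        List<SimpleCell> compacted = new ArrayList<>(newest.size());
        for (SimpleCell c : newest.values()) {
            compacted.add(c.copy());
        }
        return compacted;
    }

    public static void main(String[] args) {
        List<SimpleCell> mixed = Arrays.asList(
            new SimpleCell("row2", 1, "old"),
            new SimpleCell("row1", 2, "only"),
            new SimpleCell("row2", 3, "new"));
        for (SimpleCell c : compact(mixed)) {
            System.out.println(c.key + " @" + c.seqId + " = " + c.value);
        }
    }
}
```

The copy is what lets the memory holding the mix be released as a whole: the 
compacted list shares no cell objects with the input.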

bq. You've seen how we store blocks to hfiles with index blocks and blooms?

Yes. Maybe I am missing something, but it looks to me that this variant is not 
the best option. With a single-level index you lose the logarithmic access, and 
with a multi-level index you get the logarithmic access but pay in 
memory overhead. This is also explained in the document.



> Memory optimizations
> --------------------
>
>                 Key: HBASE-14921
>                 URL: https://issues.apache.org/jira/browse/HBASE-14921
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0
>            Reporter: Eshcar Hillel
>         Attachments: CellBlocksSegmentInMemStore.pdf
>
>
> Memory optimizations including compressed format representation and offheap 
> allocations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
