[ 
https://issues.apache.org/jira/browse/HBASE-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410934#comment-13410934
 ] 

Matt Corgan commented on HBASE-3484:
------------------------------------

I've been pondering how to better compact the data in the memstore.  Sometimes 
we see a 100MB memstore flush that is really 10MB of KeyValues, which gzips to 
about 2MB, meaning most of the in-memory footprint is pointer overhead rather 
than data.

One thing that came to mind was splitting each memstore into "regions" of 
consecutive cell ranges and fronting these regions with an index of some sort.  
Instead of Set<KeyValue> the memstore is Set<Set<KeyValue>>.  When an internal 
region crosses a certain size we split it in half.  With a good index structure 
in front of the internal regions, it might get closer to a linear 
performance/size curve.  This is comparable to how HBase splits a table into 
regions.
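
To make the Set<Set<KeyValue>> idea concrete, here is a minimal sketch of it.  
Everything in it is an assumption for illustration only: the class name 
SegmentedMemstoreSketch is made up, String keys stand in for KeyValue, a 
TreeMap plays the role of the "index of some sort", and the split threshold 
counts entries rather than bytes.  It is single-threaded, so it says nothing 
about the concurrency a real memstore needs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: a sorted set stored as an index over small sorted segments. */
final class SegmentedMemstoreSketch {
    private final int maxSegmentSize;
    // Index: lowest key a segment may hold -> the segment's sorted entries.
    private final TreeMap<String, TreeSet<String>> index = new TreeMap<>();

    SegmentedMemstoreSketch(int maxSegmentSize) {
        if (maxSegmentSize < 1) {
            throw new IllegalArgumentException("maxSegmentSize must be >= 1");
        }
        this.maxSegmentSize = maxSegmentSize;
        // "" sorts before every other String, so floorEntry() always finds a segment.
        index.put("", new TreeSet<>());
    }

    /** Adds a key to the segment covering it; splits that segment if it grew too large. */
    boolean add(String key) {
        TreeSet<String> segment = index.floorEntry(key).getValue();
        boolean added = segment.add(key);
        if (segment.size() > maxSegmentSize) {
            split(segment);
        }
        return added;
    }

    boolean contains(String key) {
        return index.floorEntry(key).getValue().contains(key);
    }

    /** Moves the upper half of a segment into a new segment and registers it in the index. */
    private void split(TreeSet<String> segment) {
        int half = segment.size() / 2;
        String median = null;
        int i = 0;
        for (String k : segment) {
            if (i++ == half) {
                median = k;
                break;
            }
        }
        TreeSet<String> upper = new TreeSet<>(segment.tailSet(median, true));
        segment.tailSet(median, true).clear(); // the view writes through to the segment
        index.put(median, upper);
    }

    int segmentCount() {
        return index.size();
    }

    /** Full scan: segments are visited in index order, so the result is globally sorted. */
    List<String> scan() {
        List<String> out = new ArrayList<>();
        for (TreeSet<String> segment : index.values()) {
            out.addAll(segment);
        }
        return out;
    }
}
```

Each segment stays small, so anything done per segment (splitting here, 
encoding or compaction later) touches a bounded amount of data no matter how 
large the whole memstore grows.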

Then, to address the pointer overhead problem, you could use DataBlockEncoding 
to encode each memstore region individually.  A memstore region could 
accumulate several blocks that get compacted periodically.  Given a region size 
of ~64-256KB, the compaction could be very aggressive and could even be done by 
the thread writing the data.  Again, this is very similar to how HBase manages 
the internals of a single region.
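
As a rough illustration of why encoding a small sorted region removes most of 
the per-entry overhead, here is a minimal prefix-delta sketch.  This is not 
HBase's DataBlockEncoding, just a simple stand-in for the general idea: the 
class name PrefixBlockSketch, the String keys, and the on-disk layout (entry 
count, then shared-prefix length, suffix length, and suffix bytes per key) are 
all assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: packs sorted keys into one byte[] using shared-prefix deltas. */
final class PrefixBlockSketch {

    /** Encodes keys that are already sorted; each key stores only what differs from its predecessor. */
    static byte[] encode(List<String> sortedKeys) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeInt(out, sortedKeys.size());
        byte[] prev = new byte[0];
        for (String key : sortedKeys) {
            byte[] cur = key.getBytes(StandardCharsets.UTF_8);
            int shared = 0;
            int limit = Math.min(prev.length, cur.length);
            while (shared < limit && prev[shared] == cur[shared]) {
                shared++;
            }
            writeInt(out, shared);              // bytes reused from the previous key
            writeInt(out, cur.length - shared); // bytes stored for this key
            out.write(cur, shared, cur.length - shared);
            prev = cur;
        }
        return out.toByteArray();
    }

    /** Rebuilds the keys by replaying the deltas in order. */
    static List<String> decode(byte[] block) {
        ByteBuffer in = ByteBuffer.wrap(block); // big-endian, matching writeInt below
        int count = in.getInt();
        List<String> keys = new ArrayList<>(count);
        byte[] prev = new byte[0];
        for (int i = 0; i < count; i++) {
            int shared = in.getInt();
            int suffixLength = in.getInt();
            byte[] cur = new byte[shared + suffixLength];
            System.arraycopy(prev, 0, cur, 0, shared);
            in.get(cur, shared, suffixLength);
            keys.add(new String(cur, StandardCharsets.UTF_8));
            prev = cur;
        }
        return keys;
    }

    // Writes a 32-bit int in big-endian order.
    private static void writeInt(ByteArrayOutputStream out, int v) {
        out.write(v >>> 24);
        out.write(v >>> 16);
        out.write(v >>> 8);
        out.write(v);
    }
}
```

The whole region becomes a single byte[] with no per-entry objects or 
references.  The trade-off is that lookups inside an encoded block need a 
sequential decode, which is one reason to keep regions small.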

This adds moving pieces and complexity but could be developed as a pluggable 
module that passes the same unit tests as the current memstore.
                
> Replace memstore's ConcurrentSkipListMap with our own implementation
> --------------------------------------------------------------------
>
>                 Key: HBASE-3484
>                 URL: https://issues.apache.org/jira/browse/HBASE-3484
>             Project: HBase
>          Issue Type: Improvement
>          Components: performance
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Priority: Critical
>         Attachments: hierarchical-map.txt, memstore_drag.png
>
>
> By copy-pasting ConcurrentSkipListMap into HBase we can make two improvements 
> to it for our use case in MemStore:
> - add an iterator.replace() method which should allow us to do upsert much 
> more cheaply
> - implement a Set directly without having to do Map<KeyValue,KeyValue> to 
> save one reference per entry
> It turns out CSLM is in public domain from its development as part of JSR 
> 166, so we should be OK with licenses.

