Eshcar Hillel created HBASE-14918:
-------------------------------------

             Summary: In-Memory MemStore Flush and Compaction
                 Key: HBASE-14918
                 URL: https://issues.apache.org/jira/browse/HBASE-14918
             Project: HBase
          Issue Type: Umbrella
    Affects Versions: 2.0.0
            Reporter: Eshcar Hillel


A memstore serves as the in-memory component of a store unit, absorbing all 
updates to the store. From time to time these updates are flushed to a file on 
disk, where they are compacted (by eliminating redundancies) and compressed 
(i.e., written in a compressed format to reduce their storage size).

We aim to speed up data access, and therefore suggest to apply in-memory 
memstore flush. That is to flush the active in-memory segment into an 
intermediate buffer where it can be accessed by the application. Data in the 
buffer is subject to compaction and can be stored in any format that allows it 
to take up smaller space in RAM. The less space the buffer consumes the longer 
it can reside in memory before data is flushed to disk, resulting in better 
performance.
Specifically, the optimization is beneficial for workloads with medium-to-high 
key churn which incur many redundant cells, like persistent messaging. 

We suggest to structure the solution as 3 subtasks (respectively, patches). 
(1) Infrastructure - refactoring of the MemStore hierarchy, introducing segment 
(StoreSegment) as first-class citizen, and decoupling memstore scanner from the 
memstore implementation;
(2) Implementation of a new memstore (CompactingMemstore) with non-optimized 
immutable segment representation, and 
(3) Memory optimization including compressed format representation and offheap 
allocations.

This Jira continues the discussion in HBASE-13408.
Design documents, evaluation results and previous patches can be found in 
HBASE-13408. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to