[jira] [Commented] (HBASE-14918) In-Memory MemStore Flush and Compaction

Anastasia Braginsky (JIRA) Mon, 22 Feb 2016 02:45:57 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156756#comment-15156756
 ]


Anastasia Braginsky commented on HBASE-14918:
---------------------------------------------

Hi,

We are now starting work on the CellBlocksSegment implementation for 
the CompactedMemStore.
As explained in the attached design, CellBlocksSegment is a flat layout for the 
immutable segments (those which were flushed in memory).
The suggestion is to implement CellBlocksSegment as a long ordered array and to 
use binary search for navigation inside the array.
The array is the data structure that suits us best, because (1) the data is 
immutable (no insertions/deletions), (2) the data is already ordered before 
being written to the array, (3) it has minimal memory overhead for pointers, 
and (4) it is the most easily serializable.
The HFile instance (with all its great possibilities) seems too complex for 
this simple task of storing the data flat.
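
To make the idea concrete, here is a minimal sketch of an immutable, 
already-ordered flat array navigated by binary search. It is an illustration 
only: the class FlatSortedSegment and its methods are hypothetical names, plain 
String keys stand in for cells, and nothing here is the actual 
CellBlocksSegment code.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not the HBase CellBlocksSegment class. */
final class FlatSortedSegment {
    private final String[] keys;   // assumed to be sorted already; never modified
    private final String[] values; // values[i] belongs to keys[i]

    FlatSortedSegment(String[] sortedKeys, String[] values) {
        if (sortedKeys.length != values.length) {
            throw new IllegalArgumentException("keys and values must have equal length");
        }
        // Defensive copies keep the segment immutable.
        this.keys = sortedKeys.clone();
        this.values = values.clone();
    }

    /** Returns the value stored under key, or null if the key is absent. */
    String get(String key) {
        int idx = Arrays.binarySearch(keys, key);
        return idx >= 0 ? values[idx] : null;
    }

    /** Index of the first entry whose key is >= the given key (a scan start position). */
    int ceilingIndex(String key) {
        int idx = Arrays.binarySearch(keys, key);
        // For a missing key, binarySearch returns -(insertionPoint) - 1.
        return idx >= 0 ? idx : -(idx + 1);
    }

    int size() {
        return keys.length;
    }
}
```

Because the data never changes after construction, lookups need no locking, 
and the only per-entry overhead is the array slot itself.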

Clearly, a long array needs to be partitioned into sub-arrays of bounded size. 
So, down to earth, we will have an array of arrays.
After looking once again at MSLAB and the memory management around it, it looks 
like those arrays can be the chunks from MSLAB.
So, in an elegant way, all memory allocations remain within MSLAB.
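
The array-of-arrays layout can be sketched as follows, under simplifying 
assumptions: the keys are plain long values, every chunk has the same capacity, 
and the chunks are ordinary Java arrays rather than MSLAB chunks. The class 
ChunkedSortedArray is a hypothetical name used only for illustration.

```java
import java.util.Arrays;

/** Hypothetical illustration: one logical sorted array stored in bounded chunks. */
final class ChunkedSortedArray {
    private final long[][] chunks; // each inner array holds at most chunkSize entries
    private final int chunkSize;
    private final int size;

    ChunkedSortedArray(long[] sorted, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
        this.size = sorted.length;
        int numChunks = (size + chunkSize - 1) / chunkSize; // ceiling division
        this.chunks = new long[numChunks][];
        for (int c = 0; c < numChunks; c++) {
            int from = c * chunkSize;
            int to = Math.min(from + chunkSize, size);
            chunks[c] = Arrays.copyOfRange(sorted, from, to);
        }
    }

    /** Maps a logical index to (chunk, offset) and reads the entry. */
    long get(int i) {
        return chunks[i / chunkSize][i % chunkSize];
    }

    /** Binary search over the logical index space; returns -1 if key is absent. */
    int indexOf(long key) {
        int lo = 0;
        int hi = size - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1; // unsigned shift avoids int overflow
            long v = get(mid);
            if (v < key) {
                lo = mid + 1;
            } else if (v > key) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    int numChunks() {
        return chunks.length;
    }
}
```

The search code never needs to know where chunk boundaries fall; only get() 
translates a logical index into a chunk and an offset.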

Even more than that, it appears that MSLAB may be arranged to allocate chunks 
off-heap (with some minor adjustments, of course).
This can be used later if needed. Another discussion is required to understand 
the off-heap possibilities in MemStore.
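
As a rough illustration of what such an adjustment could look like at the 
allocation point, the sketch below switches between on-heap and off-heap 
buffers using the standard java.nio.ByteBuffer API. The class 
ChunkAllocatorSketch and its offHeap flag are assumptions made up for this 
example; this is not how MSLAB is written today.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration only; not the MSLAB allocator. */
final class ChunkAllocatorSketch {
    private final int chunkSize;
    private final boolean offHeap; // assumed configuration switch

    ChunkAllocatorSketch(int chunkSize, boolean offHeap) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
        this.offHeap = offHeap;
    }

    /** Allocates one fixed-size chunk, either as a direct (off-heap) or a heap buffer. */
    ByteBuffer allocateChunk() {
        return offHeap
            ? ByteBuffer.allocateDirect(chunkSize)
            : ByteBuffer.allocate(chunkSize);
    }
}
```

Callers that read and write only through the ByteBuffer interface would not 
need to know which kind of chunk they were given.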

[~anoop.hbase] and everybody, what do you think?

Thanks,
Anastasia

> In-Memory MemStore Flush and Compaction
> ---------------------------------------
>
>                 Key: HBASE-14918
>                 URL: https://issues.apache.org/jira/browse/HBASE-14918
>             Project: HBase
>          Issue Type: Umbrella
>    Affects Versions: 2.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>             Fix For: 0.98.18
>
>         Attachments: CellBlocksSegmentDesign.pdf, MSLABMove.patch
>
>
> A memstore serves as the in-memory component of a store unit, absorbing all 
> updates to the store. From time to time these updates are flushed to a file 
> on disk, where they are compacted (by eliminating redundancies) and 
> compressed (i.e., written in a compressed format to reduce their storage 
> size).
> We aim to speed up data access, and therefore suggest applying an in-memory 
> memstore flush, that is, flushing the active in-memory segment into an 
> intermediate buffer where it can be accessed by the application. Data in the 
> buffer is subject to compaction and can be stored in any format that allows 
> it to take up smaller space in RAM. The less space the buffer consumes the 
> longer it can reside in memory before data is flushed to disk, resulting in 
> better performance.
> Specifically, the optimization is beneficial for workloads with 
> medium-to-high key churn which incur many redundant cells, like persistent 
> messaging. 
> We suggest structuring the solution as 4 subtasks (respectively, patches). 
> (1) Infrastructure - refactoring of the MemStore hierarchy, introducing 
> segment (StoreSegment) as first-class citizen, and decoupling memstore 
> scanner from the memstore implementation;
> (2) Adding StoreServices facility at the region level to allow memstores 
> to update region counters and access the region-level synchronization mechanism;
> (3) Implementation of a new memstore (CompactingMemstore) with non-optimized 
> immutable segment representation, and 
> (4) Memory optimization including compressed format representation and 
> off-heap allocations.
> This Jira continues the discussion in HBASE-13408.
> Design documents, evaluation results and previous patches can be found in 
> HBASE-13408. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
