[ 
https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647297#comment-14647297
 ] 

Eshcar Hillel commented on HBASE-13408:
---------------------------------------

Thank you [~Apache9] and [~anoop.hbase] for your comments.

There is a question of when to push the active set into the pipeline, and which 
threshold to use. This should be some configurable parameter. But please let’s 
put this aside for a minute.
The problem I meant to handle with the WAL truncation mechanism is orthogonal 
to this decision. Consider a region with one compacting store. Assume we add 
the following key-value-ts tuples to the memstore:
(A,1,1) (A,4,4) (A,7,7)
(B,2,2) (B,5,5) (B,8,8)
(C,3,3) (C,6,6) (C,9,9)
All these items will have edits in the WAL. After compaction, what is left 
in memory is
(A,7,7) (B,8,8) (C,9,9)
However, none of the nine edits is removed from the WAL, since no flush occurs.
This can go on and on without ever flushing data to disk and without removing 
WAL edits.
The solution we suggested earlier is to have a small map that would help 
determine that, after the compaction in the example above, we can remove all WAL 
entries that correspond to a ts less than or equal to 6. This happens outside 
the scope of a flush, since compaction is a background process. 
If we don’t change the WAL truncation in this way, the WAL can grow without limit.
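
To make the example concrete, here is a minimal sketch of the arithmetic only. 
It is not code from the patch: the class and method names are hypothetical, and 
it assumes that ts stands in for the position of the edit in the WAL and that 
the region has a single store. The truncation point is taken to be one below 
the oldest ts that survives the in-memory compaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; none of these names come from HBase or the patch.
public class WalTruncationSketch {

    /** A key-value-ts tuple. Assumption: ts orders the edits in the WAL. */
    public static final class Cell {
        public final String key;
        public final int value;
        public final long ts;

        public Cell(String key, int value, long ts) {
            this.key = key;
            this.value = value;
            this.ts = ts;
        }
    }

    /** The nine tuples from the example above. */
    public static List<Cell> exampleCells() {
        List<Cell> cells = new ArrayList<>();
        String[] keys = {"A", "B", "C"};
        for (int i = 1; i <= 9; i++) {
            // ts 1,4,7 go to A; 2,5,8 to B; 3,6,9 to C
            cells.add(new Cell(keys[(i - 1) % 3], i, i));
        }
        return cells;
    }

    /** In-memory compaction: keep only the newest version of each key. */
    public static List<Cell> compact(List<Cell> cells) {
        Map<String, Cell> latest = new TreeMap<>();
        for (Cell c : cells) {
            latest.merge(c.key, c, (a, b) -> a.ts >= b.ts ? a : b);
        }
        return new ArrayList<>(latest.values());
    }

    /** Largest ts whose WAL entries are no longer needed by the survivors. */
    public static long truncationPoint(List<Cell> survivors) {
        if (survivors.isEmpty()) {
            throw new IllegalArgumentException("no surviving cells");
        }
        long oldest = Long.MAX_VALUE;
        for (Cell c : survivors) {
            oldest = Math.min(oldest, c.ts);
        }
        return oldest - 1;
    }

    public static void main(String[] args) {
        List<Cell> survivors = compact(exampleCells());
        // Survivors are (A,7,7) (B,8,8) (C,9,9), so this prints 6.
        System.out.println(truncationPoint(survivors));
    }
}
```

With more than one store in the region, the truncation point would presumably 
have to be the minimum over all stores; the sketch does not cover that case.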

Supporting a more compact format in the compaction pipeline was discussed 
when we first started this JIRA. The design we suggested enables plugging in any 
data structure: it can be the CellBlocks by [~anoop.hbase], it can be a b-tree, 
or any alternative that is suggested in HBASE-3993. It only needs to support 
the API defined by the CellSkipListSet wrapper class (in our patch we changed 
its name to CellSet to indicate the implementation is not restricted to a 
skip-list).
Having said that, we would like to keep the initial solution simple. The 
plug-in infrastructure is in; experimenting with different data structures can 
be handled in a separate task.

Coming back to the timing of the in-memory flush: since this action requires 
the same synchronization as a flush to disk (to block the updaters while 
allocating a new active set), it seems appropriate to apply it upon a disk 
flush. 
Moreover, if we don’t change the flush semantics, a compacting memstore can be 
forced to flush to disk when it reaches 16M (I can show an example), which would 
counteract the benefits of this feature.

> HBase In-Memory Memstore Compaction
> -----------------------------------
>
>                 Key: HBASE-13408
>                 URL: https://issues.apache.org/jira/browse/HBASE-13408
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Eshcar Hillel
>         Attachments: 
> HBaseIn-MemoryMemstoreCompactionDesignDocument-ver02.pdf, 
> HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf, 
> InMemoryMemstoreCompactionEvaluationResults.pdf
>
>
> A store unit holds a column family in a region, where the memstore is its 
> in-memory component. The memstore absorbs all updates to the store; from time 
> to time these updates are flushed to a file on disk, where they are 
> compacted. Unlike disk components, the memstore is not compacted until it is 
> written to the filesystem and optionally to block-cache. This may result in 
> underutilization of the memory due to duplicate entries per row, for example, 
> when hot data is continuously updated. 
> Generally, the faster the data is accumulated in memory, the more flushes are 
> triggered and the more frequently the data sinks to disk, slowing down 
> retrieval of data, even if very recent.
> In high-churn workloads, compacting the memstore can help maintain the data 
> in memory, and thereby speed up data retrieval. 
> We suggest a new compacted memstore with the following principles:
> 1.    The data is kept in memory for as long as possible
> 2.    Memstore data is either compacted or in the process of being compacted 
> 3.    Allow a panic mode, which may interrupt an in-progress compaction and 
> force a flush of part of the memstore.
> We suggest applying this optimization only to in-memory column families.
> A design document is attached.
> This feature was previously discussed in HBASE-5311.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
