[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anoop Sam John updated HBASE-13408:
-----------------------------------
Resolution: Duplicate
Assignee: (was: Eshcar Hillel)
Status: Resolved (was: Patch Available)
Dup of HBASE-14918.
> HBase In-Memory Memstore Compaction
> -----------------------------------
>
> Key: HBASE-13408
> URL: https://issues.apache.org/jira/browse/HBASE-13408
> Project: HBase
> Issue Type: New Feature
> Reporter: Eshcar Hillel
> Fix For: 2.0.0
>
> Attachments: HBASE-13408-trunk-v01.patch,
> HBASE-13408-trunk-v02.patch, HBASE-13408-trunk-v03.patch,
> HBASE-13408-trunk-v04.patch, HBASE-13408-trunk-v05.patch,
> HBASE-13408-trunk-v06.patch, HBASE-13408-trunk-v07.patch,
> HBASE-13408-trunk-v08.patch, HBASE-13408-trunk-v09.patch,
> HBASE-13408-trunk-v10.patch,
> HBaseIn-MemoryMemstoreCompactionDesignDocument-ver02.pdf,
> HBaseIn-MemoryMemstoreCompactionDesignDocument-ver03.pdf,
> HBaseIn-MemoryMemstoreCompactionDesignDocument-ver04.pdf,
> HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf,
> InMemoryMemstoreCompactionEvaluationResults.pdf,
> InMemoryMemstoreCompactionMasterEvaluationResults.pdf,
> InMemoryMemstoreCompactionScansEvaluationResults.pdf,
> StoreSegmentandStoreSegmentScannerClassHierarchies.pdf
>
>
> A store unit holds a column family in a region, where the memstore is its
> in-memory component. The memstore absorbs all updates to the store; from time
> to time these updates are flushed to a file on disk, where they are
> compacted. Unlike disk components, the memstore is not compacted until it is
> written to the filesystem and, optionally, to the block cache. This can leave
> memory underutilized because of duplicate entries per row, for example when
> hot data is continuously updated.
> Generally, the faster data accumulates in memory, the more flushes are
> triggered and the more frequently data sinks to disk, which slows down
> retrieval of even very recent data.
> In high-churn workloads, compacting the memstore can help keep the data
> in memory, and thereby speed up data retrieval.
> We suggest a new compacted memstore with the following principles:
> 1. Data is kept in memory for as long as possible.
> 2. Memstore data is either compacted or in the process of being compacted.
> 3. A panic mode is allowed, which may interrupt an in-progress compaction and
> force a flush of part of the memstore.
> We suggest applying this optimization only to in-memory column families.
> A design document is attached.
> This feature was previously discussed in HBASE-5311.
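
For readers skimming this notification, the three principles in the
description can be illustrated with a minimal, single-threaded Java sketch.
This is not the code from the attached patches or from HBASE-14918: the class
CompactingMemstoreSketch, its Cell type, and the panicThreshold parameter are
hypothetical names chosen for illustration. The sketch also assumes that only
the newest version of each row needs to be retained, and it does not model
interrupting an in-progress compaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the HBase implementation.
class CompactingMemstoreSketch {
    // One versioned update to a row; a simplified stand-in for an HBase cell.
    static final class Cell {
        final String row;
        final long timestamp;
        final String value;

        Cell(String row, long timestamp, String value) {
            this.row = row;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    private final List<Cell> cells = new ArrayList<>();   // in-memory updates
    private final List<Cell> flushed = new ArrayList<>(); // stand-in for disk
    private final int panicThreshold; // assumed limit on in-memory cells

    CompactingMemstoreSketch(int panicThreshold) {
        this.panicThreshold = panicThreshold;
    }

    // The memstore absorbs every update, including repeated updates to a row.
    void put(String row, long timestamp, String value) {
        cells.add(new Cell(row, timestamp, value));
    }

    // In-memory compaction: keep only the newest version of each row
    // (assumption: older versions are not needed).
    void compact() {
        Map<String, Cell> newest = new TreeMap<>();
        for (Cell c : cells) {
            // On equal timestamps the later arrival wins.
            newest.merge(c.row, c,
                (old, neu) -> neu.timestamp >= old.timestamp ? neu : old);
        }
        cells.clear();
        cells.addAll(newest.values());
    }

    // Panic mode: if the memstore is still too large, flush part of it
    // (here, the first half) instead of all of it.
    void flushIfPanicking() {
        if (cells.size() <= panicThreshold) {
            return;
        }
        List<Cell> part = cells.subList(0, cells.size() / 2);
        flushed.addAll(part);
        part.clear(); // removes the flushed cells from the in-memory list
    }

    // Newest in-memory value for a row, or null if the row is not in memory.
    String get(String row) {
        Cell best = null;
        for (Cell c : cells) {
            if (c.row.equals(row) && (best == null || c.timestamp >= best.timestamp)) {
                best = c;
            }
        }
        return best == null ? null : best.value;
    }

    int inMemoryCount() { return cells.size(); }

    int flushedCount() { return flushed.size(); }
}
```

With a threshold of 2, five updates to one hot row plus one update to a cold
row compact down to two cells, so nothing has to be flushed. Only when
compaction cannot shrink the data below the threshold does the sketch fall
back to a partial flush.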
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)