[
https://issues.apache.org/jira/browse/HBASE-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422670#comment-15422670
]
Anastasia Braginsky commented on HBASE-16421:
---------------------------------------------
The summary of the previous steps:
In HBASE-14920 a new MemStore variant (called CompactingMemStore) was
introduced. In addition, HBASE-14920 introduced partitioning of the in-memory
content into segments, which can be mutable or immutable. Periodically, the
CompactingMemStore flushes the content of the mutable active segment into an
immutable segment. Immutable segments are kept in memory in the compaction
pipeline, where they are compacted (i.e. merged together, with duplicate
cells eliminated).
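
To illustrate what "merged together, with duplicate cells eliminated" means, here
is a minimal sketch. It is not the CompactingMemStore code: the class and method
names are hypothetical, cells are modeled as plain String key/value pairs, and a
"duplicate" is assumed to be a cell whose key already appears in a newer segment.

```java
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration only; not an HBase class.
final class PipelineCompactionSketch {

    // Merges immutable segments into one sorted segment.
    // Assumption: the list is ordered newest segment first, so the first
    // value seen for a key is the one to keep.
    static SortedMap<String, String> compact(
            List<? extends SortedMap<String, String>> segmentsNewestFirst) {
        TreeMap<String, String> merged = new TreeMap<>();
        for (SortedMap<String, String> segment : segmentsNewestFirst) {
            for (Map.Entry<String, String> cell : segment.entrySet()) {
                // Keep the newest version; older duplicates are dropped.
                merged.putIfAbsent(cell.getKey(), cell.getValue());
            }
        }
        return merged;
    }
}
```

The real pipeline also has to respect HBase's cell versioning rules, which this
sketch deliberately leaves out.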
In HBASE-14921 the concept of flattening segments in the pipeline was
introduced. A flat implementation of the immutable segment's index (denoted
CellArrayMap) serves as an alternative to ConcurrentSkipListMap. CellArrayMap
is implemented as an ordered array, on top of which binary search is used to
find a cell. CellArrayMap significantly reduces the memory footprint of the
segment's index (compared to ConcurrentSkipListMap). Starting with
HBASE-14921, the immutable segments in the compaction pipeline can either be
compacted or flattened (i.e. the index is transformed from
ConcurrentSkipListMap to CellArrayMap without compaction).
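
The flattening idea can be sketched as follows. This is an illustration under
stated assumptions, not the actual CellArrayMap: the class name is hypothetical,
cells are modeled as String keys and values, and the source skip list is assumed
to be immutable while it is copied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical illustration only; not the HBase CellArrayMap class.
final class FlatIndexSketch {
    private final String[] keys;    // ascending order, copied from the skip list
    private final String[] values;  // values[i] belongs to keys[i]

    // "Flattening": copy the already sorted entries into plain arrays.
    // No cells are merged or dropped, so this is not a compaction.
    FlatIndexSketch(ConcurrentSkipListMap<String, String> source) {
        List<String> k = new ArrayList<>();
        List<String> v = new ArrayList<>();
        for (Map.Entry<String, String> e : source.entrySet()) {
            k.add(e.getKey());
            v.add(e.getValue());
        }
        keys = k.toArray(new String[0]);
        values = v.toArray(new String[0]);
    }

    // Binary search over the ordered array; returns null if the key is absent.
    String get(String key) {
        int idx = Arrays.binarySearch(keys, key);
        return idx >= 0 ? values[idx] : null;
    }

    int size() {
        return keys.length;
    }
}
```

The arrays avoid the per-entry node and index objects of a skip list, which is
where the memory saving comes from.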
This JIRA should hold all the changes required to introduce yet another
variant of the immutable segment's index (denoted CellChunkMap), intended
mainly for off-heaping. CellChunkMap is a byte array, where each cell
reference is represented with up to 12 bytes. Binary search is also used to
search through CellChunkMap. Each cell is represented with (1) chunk id - the
reference to the chunk of memory holding the cell's data; (2) offset - from
the start of the chunk; (3) length - of the cell's data. The CellChunkMap uses
even fewer bytes per cell (compared to CellArrayMap) and is also the only
variant suitable for off-heaping, because it is naturally serialized. The
CellChunkMap can serve as an index only to cells allocated on chunks (from
MemStoreLAB).
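
A rough sketch of the (chunk id, offset, length) layout described above. The
names are hypothetical and this is not the prototype mentioned below. It assumes
each field is a 4-byte int (12 bytes per entry), that chunks are addressable by
id through a simple array, and that a cell's data is just its key bytes.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the HBase CellChunkMap class.
final class ChunkIndexSketch {
    static final int ENTRY_BYTES = 12;  // assumed: chunk id, offset, length as ints

    private final ByteBuffer index;     // ByteBuffer.allocateDirect would place it off-heap
    private final byte[][] chunks;      // stand-in for chunks handed out by an MSLAB
    private int count;

    ChunkIndexSketch(byte[][] chunks, int capacity) {
        this.chunks = chunks;
        this.index = ByteBuffer.allocate(capacity * ENTRY_BYTES);
    }

    // Entries must be appended in ascending order of the cell bytes they reference.
    void append(int chunkId, int offset, int length) {
        index.putInt(chunkId).putInt(offset).putInt(length);
        count++;
    }

    // Binary search over the serialized entries; returns the entry position or -1.
    int find(byte[] key) {
        int lo = 0;
        int hi = count - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int base = mid * ENTRY_BYTES;
            int chunkId = index.getInt(base);
            int offset = index.getInt(base + 4);
            int length = index.getInt(base + 8);
            int cmp = compare(chunks[chunkId], offset, length, key);
            if (cmp < 0) {
                lo = mid + 1;
            } else if (cmp > 0) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    // Unsigned lexicographic comparison of the referenced cell bytes against key.
    private static int compare(byte[] chunk, int offset, int length, byte[] key) {
        int n = Math.min(length, key.length);
        for (int i = 0; i < n; i++) {
            int a = chunk[offset + i] & 0xff;
            int b = key[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return length - key.length;
    }
}
```

Because the index holds only ints and no object references, it can be copied or
placed in off-heap memory as is. The cell data itself stays in the chunks.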
For now we see the following candidates for the sub-JIRAs:
-- The CellChunkMap implementation itself (already prototyped but not
integrated yet)
-- Related design issues (some refactoring of MemStoreChunkPool, MSLAB and
HeapMSLAB)
-- Flattening to CellChunkMap (integrating with the new code from Anoop Sam
John and ramkrishna.s.vasudevan)
-- The Big Cells issue (cells that are bigger than the chunk size)
> Introducing the CellChunkMap as a new additional index variant in the MemStore
> ------------------------------------------------------------------------------
>
> Key: HBASE-16421
> URL: https://issues.apache.org/jira/browse/HBASE-16421
> Project: HBase
> Issue Type: Umbrella
> Reporter: Anastasia Braginsky
>