[
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172035#comment-15172035
]
Anastasia Braginsky commented on HBASE-14921:
---------------------------------------------
A big thank you to [~stack] and [~anoop.hbase] for your great comments! Please
see below.
bq. On #2, you have heard the rumor that MSLAB may not be needed when running
on G1GC (TBD)
What makes you think that using G1GC is better than MSLAB? In my understanding
G1GC indeed decreases GC pauses, but it does so using parallel programming and
more complex algorithms, so you are going to pay in CPU cycles and in memory.
Ad hoc memory management is always at least as good as a universal one. So I
believe MSLAB still needs to be used (and not only because of the off-heap
option) even if G1GC is used.
bq. So, when you say the MSLAB can be offheap, it's ok to have references only
in CSLM? We do not want to be copying data across the onheap/offheap boundary
if it can be avoided.
When MSLAB goes off-heap, there is no copying of data across the
on-heap/off-heap boundary! The only copies are at the beginning, if data
arrives on-heap and needs to be copied down into off-heap MSLAB Chunks, and at
the end, when flushing to disk: as I see it, the HFile Writer still uses an
on-heap byte stream, so there is no option but to copy back from off-heap to
on-heap.
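For illustration, here is a minimal sketch of those two copies with an
off-heap chunk backed by a direct ByteBuffer. All names are hypothetical, not
the actual HBase MSLAB API:
{code:java}
import java.nio.ByteBuffer;

// Minimal sketch of an off-heap MSLAB chunk (hypothetical names, not the
// actual HBase API). The only copies are the initial copy-in (on-heap bytes
// -> direct buffer) and the copy-out at flush time for the HFile writer.
public class OffHeapChunkSketch {
  private final ByteBuffer chunk; // direct (off-heap) storage
  private int nextFree = 0;

  public OffHeapChunkSketch(int capacityBytes) {
    this.chunk = ByteBuffer.allocateDirect(capacityBytes);
  }

  /** Copy an on-heap serialized cell in; returns its offset, or -1 if full. */
  public int copyCellIn(byte[] cellBytes, int offset, int len) {
    if (nextFree + len > chunk.capacity()) {
      return -1; // caller would move on to a fresh chunk
    }
    int allocOffset = nextFree;
    ByteBuffer dup = chunk.duplicate(); // private position, shared storage
    dup.position(allocOffset);
    dup.put(cellBytes, offset, len);
    nextFree += len;
    return allocOffset;
  }

  /** Copy back on-heap, e.g. for an HFile writer that needs a byte stream. */
  public byte[] copyCellOut(int allocOffset, int len) {
    byte[] out = new byte[len];
    ByteBuffer dup = chunk.duplicate();
    dup.position(allocOffset);
    dup.get(out, 0, len);
    return out;
  }
}
{code}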
And about having references only in CSLM, what do you mean? No need for
CellBlocks? Or do you want the entire Cell object to be pushed inside the
ConcurrentSkipListMap? Note that references between off-heap and on-heap are
OK (no extra performance cost); only the accesses themselves are performed
differently, as the sketch below illustrates.
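To make that last point concrete: an on-heap object can hold a reference into
an off-heap buffer at no extra cost; only the byte accesses differ (absolute
ByteBuffer reads instead of byte[] indexing). A minimal sketch, again with
hypothetical names:
{code:java}
import java.nio.ByteBuffer;

// Sketch of an on-heap handle over off-heap cell bytes (hypothetical names).
// The object itself lives on-heap; the data it points to lives off-heap.
public class OffHeapBackedCell {
  private final ByteBuffer buf; // direct buffer holding the cell bytes
  private final int offset;     // where this cell starts in the buffer
  private final int length;     // serialized length of the cell

  public OffHeapBackedCell(ByteBuffer buf, int offset, int length) {
    this.buf = buf;
    this.offset = offset;
    this.length = length;
  }

  /** Absolute read from the direct buffer: no backing array, no copy. */
  public byte byteAt(int i) {
    return buf.get(offset + i);
  }

  public int getLength() {
    return length;
  }
}
{code}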
bq. So, it looks like you are talking of doing at least an extra copy from the
original MSLAB to a new Segment MSLAB. Would be cool to have a bit of evidence
that this extra copy to a Segment, even in the worst case where no purge was
possible, costs less than trying to continue with a 'fat' CSLM.
You are totally right; it would be good to have some “compaction predictor”
that indicates how badly a compaction is needed. We have some thoughts on how
it can be done, but it is not a trivial task. In order not to mix everything
together just now, we can add such a predictor later, after we have benchmarks
for the flat representation and the off-heaping. As you can see, there is a
lot to be done; let us take the challenges one by one.
bq. "The compaction process can repeat until we must flush to disk. " There
will be guards in place to prevent our compacting in-memory when it not
possible that a compaction can produce a tighter in-memory representation (no
purges possible, etc.)?
Currently we do not have such “guards”, and I understand your concern about
unneeded or frequent compactions. For now, compaction starts (asynchronously)
soon after an in-memory flush, and we assume the in-memory flush is an
infrequent task that “freezes” (makes immutable) a large amount of memory. The
assumption is that within a large amount of memory there is a higher
probability of finding something to compact.
bq. When will the compaction get triggered? Time based and/or
#ImmutableSegments in the pipeline?
bq. So I am very interested to know when you consider we can compact the CSLM
cells into an array.
As I have said, currently the compaction is triggered asynchronously after
each in-memory flush, if there is no other on-going compaction.
#ImmutableSegments in the pipeline can also be a trigger. Please note that the
compaction process happens in the background (!!!), meaning that no one waits
for it. It costs you CPU cycles only, and if you lower the priority of the
compacting thread, even the CPU cycles should not be an issue. So I wouldn’t
worry so much about the time spent on copying during compaction… Am I missing
something?
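As an illustration of this triggering scheme (hypothetical names, not the
actual MemStore code), a compare-and-set flag ensures at most one in-memory
compaction runs at a time, on a low-priority background thread that nobody
waits for:
{code:java}
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of the trigger logic described above: fire a background compaction
// after each in-memory flush unless one is already on-going.
public class InMemoryCompactionTrigger {
  private final AtomicBoolean compactionInProgress = new AtomicBoolean(false);

  /** Called right after an in-memory flush makes a segment immutable. */
  public void onInMemoryFlush(Runnable compactPipeline) {
    // Skip if another compaction is already running.
    if (!compactionInProgress.compareAndSet(false, true)) {
      return;
    }
    Thread t = new Thread(() -> {
      try {
        compactPipeline.run(); // compact the immutable segments in the pipeline
      } finally {
        compactionInProgress.set(false);
      }
    }, "in-memory-compactor");
    t.setPriority(Thread.MIN_PRIORITY); // readers/writers never wait on this
    t.setDaemon(true);
    t.start();
  }
}
{code}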
bq. So for making the array of Cells we need to know how many cells will
survive into the compacted result. So we will scan over the ImmutableSegments
2 times? Once to know the #cells and then to actually move them into the
array.
No. We are going to allocate the array of Cells for the worst case: all the
cells survive. Note that a Cell reference takes very little space.
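A minimal sketch of that single-pass approach (hypothetical names): the target
array is sized for the worst case, and survivors are appended as the segment
is scanned once; the unused tail costs only one reference slot per entry:
{code:java}
// Sketch of single-pass compaction into a flat array (hypothetical names).
public class FlatSegmentBuilder {

  // Stand-in for org.apache.hadoop.hbase.Cell, to keep the sketch
  // self-contained.
  interface Cell {}

  public static Cell[] compact(Iterable<Cell> segmentCells, int maxCellCount) {
    Cell[] flat = new Cell[maxCellCount]; // worst case: every cell survives
    int survivors = 0;
    for (Cell c : segmentCells) {
      if (!isPurgeable(c)) { // e.g. shadowed version, expired TTL, delete
        flat[survivors++] = c;
      }
    }
    // Only the first `survivors` slots are live; a search over the sorted
    // prefix simply ignores the unused tail.
    return flat;
  }

  private static boolean isPurgeable(Cell c) {
    return false; // placeholder: real logic checks versions/TTL/delete markers
  }
}
{code}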
bq. If we know the #cells compacting out and the #cells which will get away,
we can decide whether it is worth copying to a new area or not.
This is also a possibility.
bq. It is not just 8 bytes extra overhead per cell when we have an array of
cells instead of a plain bytes cellblock (as HFile data block)
bq. Ref to cell in array (8 bytes) + Cell Object (16 bytes) + ref to byte[]
with Cell (8) + offset and length ints (8) = 40 bytes per cell.
OK. Note that when you have a plain bytes cellblock (as in an HFile data
block), in the CellBlock as in HBASE-10713, you had a TreeMap overhead on top
of the plain bytes for the search. So if we are not counting the Cell data in
the MSLAB, a 40-byte overhead per cell is still good. In CSLM you have roughly
4x40 = 160 bytes of overhead per Cell (again not counting the Cell data in the
MSLAB, which can be 1KB).
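Spelling the arithmetic out (the 4x CSLM multiplier is the rough estimate used
above, not a measured number):
{code:java}
// Per-cell metadata overhead, not counting the cell data itself in the MSLAB.
public class OverheadEstimate {
  public static void main(String[] args) {
    int refInArray = 8;    // reference slot in the flat Cell[] array
    int cellObject = 16;   // Cell object header
    int refToBytes = 8;    // reference to the byte[]/buffer holding the cell
    int offsetAndLen = 8;  // two ints: offset + length
    int flatOverhead = refInArray + cellObject + refToBytes + offsetAndLen;
    int cslmOverhead = 4 * flatOverhead; // rough CSLM node/index estimate

    System.out.println("flat array: " + flatOverhead + " bytes/cell"); // 40
    System.out.println("CSLM:       " + cslmOverhead + " bytes/cell"); // 160
  }
}
{code}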
> Memory optimizations
> --------------------
>
> Key: HBASE-14921
> URL: https://issues.apache.org/jira/browse/HBASE-14921
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.0.0
> Reporter: Eshcar Hillel
> Assignee: Anastasia Braginsky
> Attachments: CellBlocksSegmentInMemStore.pdf
>
>
> Memory optimizations including compressed format representation and offheap
> allocations
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)