[
https://issues.apache.org/jira/browse/HBASE-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277864#comment-15277864
]
Anoop Sam John commented on HBASE-14920:
----------------------------------------
I have not gone through the latest patch yet.
So it will work this way: the active segment is moved to the pipeline at a
threshold and flushed in memory, and this new immutable segment now sits in the
pipeline. Later another active segment is moved to the pipeline, these two
segments are compacted together into one immutable segment, and this continues.
Correct? I was thinking that a segment, once flushed in memory, would not be
compacted any further. That was the biggest worry behind saying that we will
flush smaller HFiles. Yes, with an in-memory flush we will now produce bigger
HFiles (compared to the current memstore), because in the current memstore a
considerable part of the heap size is overhead from Cell and CSLM nodes. What I
was saying is that, compared to flushing the whole memstore (not just the tail
of the pipeline) in the new implementation as well, we will produce a smaller
HFile.
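To check my understanding of the flow above, here is a minimal toy sketch in
Java. It is not the patch and not HBase code: the class, its methods, and the
cell-count threshold are all hypothetical, and a sorted map that keeps only the
newest value per key stands in for real segments and for version expiry.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.TreeMap;

// Toy model only: hypothetical names, not HBase classes.
class ToyCompactingMemstore {
    private final int inMemoryFlushThreshold; // cell count that triggers an in-memory flush
    private TreeMap<String, String> active = new TreeMap<>();
    // Immutable segments, newest first; the last element is the tail.
    private final Deque<TreeMap<String, String>> pipeline = new ArrayDeque<>();

    ToyCompactingMemstore(int inMemoryFlushThreshold) {
        this.inMemoryFlushThreshold = inMemoryFlushThreshold;
    }

    void put(String key, String value) {
        active.put(key, value);
        if (active.size() >= inMemoryFlushThreshold) {
            inMemoryFlush();
        }
    }

    // Move the active segment into the pipeline, then compact everything in
    // the pipeline into a single immutable segment.
    private void inMemoryFlush() {
        pipeline.addFirst(active);
        active = new TreeMap<>();
        if (pipeline.size() > 1) {
            TreeMap<String, String> merged = new TreeMap<>();
            // Walk oldest to newest so that newer values overwrite older ones.
            Iterator<TreeMap<String, String>> oldestFirst = pipeline.descendingIterator();
            while (oldestFirst.hasNext()) {
                merged.putAll(oldestFirst.next());
            }
            pipeline.clear();
            pipeline.addFirst(merged);
        }
    }

    int pipelineSegments() {
        return pipeline.size();
    }

    int cellsInPipeline() {
        int total = 0;
        for (TreeMap<String, String> segment : pipeline) {
            total += segment.size();
        }
        return total;
    }

    public static void main(String[] args) {
        ToyCompactingMemstore memstore = new ToyCompactingMemstore(2);
        memstore.put("a", "1");
        memstore.put("b", "2"); // first in-memory flush
        memstore.put("a", "3");
        memstore.put("c", "4"); // second in-memory flush, segments get compacted
        System.out.println(memstore.pipelineSegments() + " segment(s), "
            + memstore.cellsInPipeline() + " cell(s) in pipeline");
    }
}
```

In this toy, every in-memory flush re-reads the whole already compacted segment
and copies it into a new map, even when only one key was overwritten.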
I will see how we handle the close situation.
On compacting an already compacted segment multiple times: when many cells can
go away because of version expiry, deletes, etc., that is good, as we reduce
heap space with every compaction. But in a normal case we might not reduce it,
as the cells might not be moving out. So how can we balance this? Just asking,
because the next compaction has to create a scanner over the immutable segment,
read the cells, and create Cells again. That makes a lot of garbage.
Another concern with not flushing the whole memstore: when the system is at the
global memstore size limit, the region to flush is chosen by the whole heap
size of its memstore, so we will select the region with the highest heap size
(?). But that flush might not actually release that much memory. If there were
another (normal, old-style) memstore and it had been selected instead, we would
have freed much more heap.
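A small Java sketch of this concern, under made-up assumptions: the class,
field names, and sizes are all hypothetical and are not taken from HBase or
from the patch. It only contrasts picking by total heap size with what a flush
would actually release.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model only: hypothetical names and numbers, not HBase code.
class FlushSelectionSketch {
    static final class RegionMemstore {
        final String region;
        final long heapSizeMb;   // total heap held by the memstore
        final long releasableMb; // heap actually freed by one flush

        RegionMemstore(String region, long heapSizeMb, long releasableMb) {
            this.region = region;
            this.heapSizeMb = heapSizeMb;
            this.releasableMb = releasableMb;
        }
    }

    // Selection as described above: the region with the highest heap size wins.
    static RegionMemstore pickByHeapSize(List<RegionMemstore> regions) {
        return regions.stream()
            .max(Comparator.comparingLong((RegionMemstore r) -> r.heapSizeMb))
            .get();
    }

    // For comparison only: the region whose flush would free the most heap.
    static RegionMemstore pickByReleasable(List<RegionMemstore> regions) {
        return regions.stream()
            .max(Comparator.comparingLong((RegionMemstore r) -> r.releasableMb))
            .get();
    }

    public static void main(String[] args) {
        List<RegionMemstore> regions = Arrays.asList(
            // Compacting memstore: only the tail of the pipeline is flushed.
            new RegionMemstore("A", 128, 40),
            // Old-style memstore: the whole memstore is flushed.
            new RegionMemstore("B", 100, 100));
        RegionMemstore chosen = pickByHeapSize(regions);
        RegionMemstore best = pickByReleasable(regions);
        System.out.println("chosen " + chosen.region + " frees " + chosen.releasableMb + " MB");
        System.out.println("region " + best.region + " would free " + best.releasableMb + " MB");
    }
}
```

With these made-up numbers, selecting by heap size picks region A and frees
40 MB, while flushing region B would have freed 100 MB.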
> Compacting Memstore
> -------------------
>
> Key: HBASE-14920
> URL: https://issues.apache.org/jira/browse/HBASE-14920
> Project: HBase
> Issue Type: Sub-task
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
> Attachments: HBASE-14920-V01.patch, HBASE-14920-V02.patch,
> HBASE-14920-V03.patch, HBASE-14920-V04.patch, HBASE-14920-V05.patch,
> HBASE-14920-V06.patch, HBASE-14920-V07.patch, HBASE-14920-V08.patch,
> move.to.junit4.patch
>
>
> Implementation of a new compacting memstore with non-optimized immutable
> segment representation