[
https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305509#comment-16305509
]
Eshcar Hillel commented on HBASE-18294:
---------------------------------------
Thank you [~anoop.hbase] for explaining. I think I now understand what is going
on; however, I must say things are not at all simple :)
In different places we give different semantics to the same name. For example,
data size means key-value size in some places and off-heap size in others;
heap size sometimes means heap size and sometimes means metadata size.
This is very hard to follow and maintain.
My understanding is that on off-heap clusters part of the memory is allocated
on-heap and part is allocated off-heap, and there is no way around it.
As this is the state of things, I strongly feel that we should go back to
the patch that introduced three counters in MemStoreSize: data size, heap size,
and off-heap size.
At the region level we'll have MemStoreSize accounting (with 3 counters), and
have flush decisions based on comparing to (existing) on-heap and (new)
off-heap thresholds;
while at the RS level we'll have only off-heap accounting and on-heap
accounting to be compared against the two existing thresholds.
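To make the proposal concrete, here is a rough sketch in Java. It is not the
patch itself: the class MemStoreAccountingSketch, the MemStoreSizeSketch type,
the method names and the threshold values are all made up for illustration, and
it assumes that crossing either threshold is enough to trigger a flush.

```java
// Hypothetical sketch only; these are not the real HBase classes or the patch.
public class MemStoreAccountingSketch {

    /** Three counters, each meaning exactly what its name says. */
    public static final class MemStoreSizeSketch {
        public final long dataSize;    // key-value bytes only
        public final long heapSize;    // bytes occupied on the Java heap
        public final long offHeapSize; // bytes occupied off-heap

        public MemStoreSizeSketch(long dataSize, long heapSize, long offHeapSize) {
            this.dataSize = dataSize;
            this.heapSize = heapSize;
            this.offHeapSize = offHeapSize;
        }

        /** Adds two sizes counter by counter, e.g. to sum stores into a region. */
        public MemStoreSizeSketch plus(MemStoreSizeSketch other) {
            return new MemStoreSizeSketch(
                dataSize + other.dataSize,
                heapSize + other.heapSize,
                offHeapSize + other.offHeapSize);
        }
    }

    /** Region level: compare occupancy (not data size) to the two thresholds. */
    public static boolean regionShouldFlush(MemStoreSizeSketch size,
                                            long onHeapThreshold,
                                            long offHeapThreshold) {
        // Assumption: either counter crossing its threshold triggers a flush.
        return size.heapSize > onHeapThreshold
            || size.offHeapSize > offHeapThreshold;
    }

    /** RS level: only on-heap and off-heap totals, against the two limits. */
    public static boolean serverAboveLimit(long totalHeapSize,
                                           long totalOffHeapSize,
                                           long onHeapLimit,
                                           long offHeapLimit) {
        return totalHeapSize > onHeapLimit || totalOffHeapSize > offHeapLimit;
    }

    public static void main(String[] args) {
        final long MB = 1024L * 1024L;
        // Made-up numbers: 100 MB of key-values occupying 140 MB of heap.
        MemStoreSizeSketch region = new MemStoreSizeSketch(100 * MB, 140 * MB, 0);
        long onHeapThreshold = 128 * MB;  // the 128MB region flush default
        long offHeapThreshold = 128 * MB; // hypothetical value for the new threshold

        System.out.println("data-size check flushes: "
            + (region.dataSize > onHeapThreshold));
        System.out.println("occupancy check flushes: "
            + regionShouldFlush(region, onHeapThreshold, offHeapThreshold));
    }
}
```

With these made-up numbers, comparing data size to the threshold does not
trigger a flush, while comparing heap size does.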
This may seem like a lot to change; however,
(1) the patch is practically ready (it might need some small modifications and
adjustments due to recent code reviews), and
(2) at the end of the day each counter has its own semantics, which closely
match its name and are consistent across levels.
I think #2 is super important to avoid future bugs and facilitate maintenance
of the code for further adjustments.
Do you see any reason to object to the change besides your view that it is more
complicated than the current trunk code?
> Reduce global heap pressure: flush based on heap occupancy
> ----------------------------------------------------------
>
> Key: HBASE-18294
> URL: https://issues.apache.org/jira/browse/HBASE-18294
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-18294.01.patch, HBASE-18294.02.patch,
> HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch,
> HBASE-18294.06.patch, HBASE-18294.07.patch, HBASE-18294.07.patch,
> HBASE-18294.08.patch, HBASE-18294.09.patch, HBASE-18294.10.patch,
> HBASE-18294.11.patch, HBASE-18294.11.patch, HBASE-18294.12.patch,
> HBASE-18294.13.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size
> is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the
> store to another threshold (that can be configured with
> hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size
> (key-value only) to the threshold, whereas it should compare the heap size
> (which includes index size and metadata).
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)