[
https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16264459#comment-16264459
]
Eshcar Hillel commented on HBASE-18294:
---------------------------------------
I am back with some numbers.
First, I noticed that master suffers a major performance degradation w.r.t.
branch-2. This is out of the scope of this Jira and I plan to discuss this
issue separately.
Considering only the delta presented in the current patch, here is what I
observe for a write-only workload with default parameters (Basic memstore
compaction):
||code||Throughput||#flushes||#global heap pressure log lines||
|master|58-59Kops|~1250|~700|
|master+patch|70-71Kops|~2000|0|
We see similar trends when running with no memstore compaction. Looking at the
heap size instead of the data size causes more disk flushes, since each store
triggers flushes more frequently. However, the throughput increases
significantly as we *never* reach global heap pressure. IMO this demonstrates
that frequent pressure due to global heap size is not healthy, at least from
a performance perspective.
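To make the "more flushes" effect concrete, here is a rough sketch of the
arithmetic, not code from the patch. The class, the method name, and the
per-cell sizes are all hypothetical; only the 128MB default comes from the
issue. When the accounted size per cell includes index and metadata overhead,
a store reaches the same threshold after fewer cells, so it flushes more often.

```java
// Hypothetical sketch: names and per-cell sizes are illustrative, not HBase APIs.
final class FlushTriggerSketch {
    // Default flush threshold mentioned in the issue (128MB).
    static final long FLUSH_THRESHOLD_BYTES = 128L * 1024 * 1024;

    /**
     * Number of cells written before the accounted size first reaches the
     * threshold, assuming every cell is accounted with the same size.
     */
    static long cellsUntilFlush(long accountedBytesPerCell, long thresholdBytes) {
        // Ceiling division: the flush fires on the cell that reaches the threshold.
        return (thresholdBytes + accountedBytesPerCell - 1) / accountedBytesPerCell;
    }
}
```

With an assumed 100 bytes of key-value data and an assumed 60 bytes of
overhead per cell, `cellsUntilFlush(160, FLUSH_THRESHOLD_BYTES)` is smaller
than `cellsUntilFlush(100, FLUSH_THRESHOLD_BYTES)`, i.e. heap-size accounting
triggers the flush earlier for the same write stream.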
These experiments show the benefit of the patch for on-heap stores. I think it
is best to enforce symmetric behavior for on-heap and off-heap stores, and this
should start with the naming convention. So let's not have data size vs.
on-heap size but rather on-heap vs. off-heap size.
The reason I think we should have two (optional) thresholds is that the space
allocated on- and off-heap and their usage can vary. Or let me phrase it as a
question: is there a reason not to give the admin the liberty to set these
thresholds differently? If they are not set by the admin, they get the default
value (which is currently 128MB).
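A minimal sketch of what two optional thresholds with a shared default could
look like. This is an illustration only: the class name and the two
configuration keys are made up and are not existing HBase properties, and a
plain Map stands in for the real configuration object.

```java
import java.util.Map;

// Hypothetical sketch: class name and config keys are invented for illustration.
final class FlushThresholdsSketch {
    // Default used when the admin sets nothing (currently 128MB).
    static final long DEFAULT_FLUSH_SIZE = 128L * 1024 * 1024;

    // Made-up keys, NOT real HBase configuration properties.
    static final String ON_HEAP_KEY = "example.flush.size.onheap";
    static final String OFF_HEAP_KEY = "example.flush.size.offheap";

    final long onHeapThreshold;
    final long offHeapThreshold;

    FlushThresholdsSketch(Map<String, Long> conf) {
        // Each threshold is optional and falls back to the default independently.
        onHeapThreshold = conf.getOrDefault(ON_HEAP_KEY, DEFAULT_FLUSH_SIZE);
        offHeapThreshold = conf.getOrDefault(OFF_HEAP_KEY, DEFAULT_FLUSH_SIZE);
    }

    /** Flush when either the on-heap or the off-heap size exceeds its threshold. */
    boolean shouldFlush(long onHeapBytes, long offHeapBytes) {
        return onHeapBytes > onHeapThreshold || offHeapBytes > offHeapThreshold;
    }
}
```

Treating on-heap and off-heap size the same way keeps the behavior symmetric,
while an admin who sets only one key changes only that threshold.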
> Reduce global heap pressure: flush based on heap occupancy
> ----------------------------------------------------------
>
> Key: HBASE-18294
> URL: https://issues.apache.org/jira/browse/HBASE-18294
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
> Attachments: HBASE-18294.01.patch, HBASE-18294.02.patch,
> HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch,
> HBASE-18294.06.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size
> is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the
> store to another threshold (that can be configured with
> hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size
> (key-value only) to the threshold where it should compare the heap size
> (which includes index size, and metadata).
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)