[
https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14624518#comment-14624518
]
Hadoop QA commented on HBASE-14058:
-----------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12744997/HBASE-14058-v1.patch
against master branch at commit 5e708746b8d301c2fb22a85b8756129147012374.
ATTACHMENT ID: 12744997
{color:green}+1 @author{color}. The patch does not contain any @author
tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include
any new or modified tests.
Please justify why no new tests are needed for this
patch.
Also please list what manual steps were performed to
verify this patch.
{color:green}+1 hadoop versions{color}. The patch compiles with all
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the
total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any
warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the
total number of checkstyle errors.
{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100 characters.
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/14754//testReport/
Release Findbugs (version 2.0.3) warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/14754//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors:
https://builds.apache.org/job/PreCommit-HBASE-Build/14754//artifact/patchprocess/checkstyle-aggregate.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/14754//console
This message is automatically generated.
> Stabilizing default heap memory tuner
> -------------------------------------
>
> Key: HBASE-14058
> URL: https://issues.apache.org/jira/browse/HBASE-14058
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Affects Versions: 2.0.0, 1.2.0, 1.3.0
> Reporter: Abhilash
> Assignee: Abhilash
> Attachments: HBASE-14058-v1.patch, HBASE-14058.patch,
> after_modifications.png, before_modifications.png
>
>
> The memory tuner works well in general cases, but when we have a workload
> that is both read heavy and write heavy the tuner performs too many tuning
> operations. We should try to control the number of tuner operations and
> stabilize it. The main problem was that the tuner considered itself in
> steady state after seeing just one neutral tuner period, and therefore
> performed too many tuning operations and too many reverts, with large step
> sizes (the step size was reset to maximum even after a single neutral
> period). To stop this I propose these steps:
> 1) The band created by μ + δ/2 and μ - δ/2 is too narrow. Statistically,
> ~62% of periods will lie outside this range, which means 62% of the data
> points are considered either high or low, which is too much. Use μ + δ*0.8
> and μ - δ*0.8 instead. In expectation this decreases the number of tuner
> operations per 100 periods from 19 to just 10: with δ/2, 31% of data values
> are considered high and 31% low (2 * 0.31 * 0.31 = 0.19), whereas with
> δ*0.8 only 22% are high and 22% low (2 * 0.22 * 0.22 ~ 0.10). A tuner
> operation is expected when one load is flagged high while the other is low,
> hence the 2*p*p. (See the check after this list.)
> 2) Define a proper steady state by looking at the past few periods (the
> number of periods considered equals
> hbase.regionserver.heapmemory.autotuner.lookup.periods) rather than just
> the last tuner operation. We say the tuner is in steady state when the last
> few tuner periods were NEUTRAL. We keep decreasing the step size until it
> is extremely low, then leave the system in that state for some time.
> 3) Rather than decreasing the step size only while reverting, decrease its
> magnitude whenever we are trying to revert tuning done in the last few
> periods (sum the changes of the last few periods and compare the sum to the
> current step) rather than just looking at the last period. When the
> magnitude gets too low, make the tuner steps NEUTRAL (no operation). This
> causes the step size to decrease continuously until we reach steady state,
> after which the tuning process restarts (the tuner step size resets again
> when we reach steady state).
> 4) The tuning done in the last few periods is tracked as a decaying sum of
> past tuner steps, with sign: positive for an increase in memstore, negative
> for an increase in block cache. We use this rather than an arithmetic mean
> to give more weight to recent tuner steps. (A sketch of points 2-4 follows
> below.)
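> As a check on the arithmetic in point 1 (assuming the per-period load
> counts are roughly normally distributed, and reading 2*p*p as "one load
> flagged high while the other is low"), the tail probabilities can be
> computed directly. This is a minimal sketch using Apache Commons Math,
> which is this sketch's own dependency choice, not part of the patch:
> {code:java}
> import org.apache.commons.math3.distribution.NormalDistribution;
>
> public class TunerThresholdCheck {
>   public static void main(String[] args) {
>     NormalDistribution stdNormal = new NormalDistribution(); // mean 0, sd 1
>     for (double k : new double[] { 0.5, 0.8 }) {
>       // P(X > mu + k*delta) for one load; the same mass lies below
>       // mu - k*delta, so `tail` is the chance of being flagged high (or low).
>       double tail = 1.0 - stdNormal.cumulativeProbability(k);
>       // One load high and the other low, in either order: 2 * tail * tail.
>       System.out.printf("k=%.1f: tail=%.1f%%, expected ops per 100 periods=%.1f%n",
>           k, 100 * tail, 100 * 2 * tail * tail);
>     }
>   }
> }
> {code}
> This prints ~30.9%/19.0 for k=0.5 and ~21.2%/9.0 for k=0.8, close to the
> rounded 31%/19 and 22%/10 figures above.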
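> Points 2-4 together amount to a small piece of bookkeeping. The following
> is an illustrative sketch only; the class, field, and constant names are
> mine, not the patch's actual code:
> {code:java}
> /** Sketch of points 2-4: decaying sum of steps plus revert damping. */
> class TunerStepTracker {
>   // Signed decaying sum of recent steps: positive means memstore was
>   // grown recently, negative means block cache was grown (point 4).
>   private double decayingTunedSize = 0.0;
>   private double stepSize = 0.08;               // current step magnitude
>   private static final double MIN_STEP = 0.005; // below this, go NEUTRAL
>
>   /** Record one finished tuner period; step is signed, 0 when NEUTRAL. */
>   void record(double step) {
>     // Halving the old sum weights recent periods more heavily than an
>     // arithmetic mean over the lookup window would (point 4).
>     decayingTunedSize = decayingTunedSize / 2.0 + step;
>   }
>
>   /** Damp the step when it would revert recent tuning (point 3). */
>   double nextStep(double proposedSignedStep) {
>     if (proposedSignedStep * decayingTunedSize < 0) {
>       stepSize /= 2.0; // shrink instead of jumping back to the maximum
>     }
>     if (stepSize < MIN_STEP) {
>       return 0.0; // NEUTRAL; steady state later resets stepSize (point 2)
>     }
>     return Math.signum(proposedSignedStep) * stepSize;
>   }
> }
> {code}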
> Please see the attachments. One shows the sizes of memstore (green) and
> block cache (blue) as adjusted by the tuner without these modifications,
> and the other with the modifications applied. The x-axis is time and the
> y-axis is the fraction of heap memory available to memstore and block cache
> at that time (the two always sum to 80%). I configured the min/max range
> for both components to 0.1 and 0.7 respectively (so in the plots the y-axis
> min and max are 0.1 and 0.7). In both cases the tuner tries to distribute
> memory by giving ~15% to memstore and ~65% to block cache, but the modified
> tuner does it much more smoothly.
> I got these results from a YCSB test doing approximately 5000 inserts and
> 500 reads per second (for one region server). The results can be fine-tuned
> further, and the number of tuner operations reduced, with these
> configuration changes (see the sketch after the list):
> For more fine tuning:
> a) lower the max step size (suggested = 4%)
> b) lower the min step size (the default is also fine)
> To further decrease the frequency of tuning operations:
> c) increase the number of lookup periods (in the tests it was just 10; the
> default is 60)
> d) increase the tuner period (in the tests it was just 20 secs; the default
> is 60 secs)
> I used a smaller tuner period and fewer lookup periods to get more data
> points.
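> Expressed as configuration overrides, suggestions a) to d) would look
> roughly as below. Only hbase.regionserver.heapmemory.autotuner.lookup.periods
> is named in this description; the other two key names are placeholders, so
> check the actual tuner keys before using them:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
>
> public class TunerConfigSketch {
>   public static void main(String[] args) {
>     Configuration conf = new Configuration();
>     // c) more lookup periods (this key is named in the description above)
>     conf.setInt("hbase.regionserver.heapmemory.autotuner.lookup.periods", 60);
>     // a) lower max step size, 4% as suggested (placeholder key name)
>     conf.setFloat("hbase.regionserver.heapmemory.autotuner.step.max", 0.04f);
>     // d) longer tuner period in ms (placeholder key name)
>     conf.setLong("hbase.regionserver.heapmemory.tuner.period", 60000L);
>   }
> }
> {code}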
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)