[
https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107757#comment-15107757
]
Hudson commented on HBASE-14058:
--------------------------------
SUCCESS: Integrated in HBase-1.2-IT #401 (See
[https://builds.apache.org/job/HBase-1.2-IT/401/])
HBASE-14058 Stabilizing default heap memory tuner (eclark: rev
e738e69f8cc59581a454207483aca42e7f314396)
*
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultHeapMemoryTuner.java
> Stabilizing default heap memory tuner
> -------------------------------------
>
> Key: HBASE-14058
> URL: https://issues.apache.org/jira/browse/HBASE-14058
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Affects Versions: 2.0.0, 1.2.0, 1.3.0
> Reporter: Abhilash
> Assignee: Abhilash
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 0001-Stabilizing-default-heap-memory-tuner.patch,
> HBASE-14058-v1.patch, HBASE-14058.patch, after_modifications.png,
> before_modifications.png
>
>
> The memory tuner works well in general, but when a workload is both read
> heavy and write heavy the tuner performs too many tuning operations. We
> should limit the number of tuner operations and stabilize it. The main
> problem is that the tuner thinks it is in steady state after seeing just one
> neutral tuner period, so it does too many tuning operations and too many
> reverts, and does them with large step sizes (the step size was set to the
> maximum after even a single neutral period). To stop this I propose these
> steps:
> 1) The band between μ - δ/2 and μ + δ/2 is too narrow. Statistically,
> ~62% of periods will lie outside this range, which means 62% of the data
> points are classified as either high or low, which is too many. Use μ + δ*0.8
> and μ - δ*0.8 instead. In expectation this decreases the number of tuner
> operations per 100 periods from 19 to just 10. If we use δ/2 then 31% of data
> values are considered high and 31% are considered low (2*0.31*0.31 = 0.19);
> if we use δ*0.8 then 22% are low and 22% are high (2*0.22*0.22 ~ 0.10).
> 2) Define steady state properly by looking at the past few periods (the
> number of periods is hbase.regionserver.heapmemory.autotuner.lookup.periods)
> rather than just the last tuner operation. We say the tuner is in steady
> state when the last few tuner periods were all NEUTRAL. We keep decreasing
> the step size until it is extremely low, then leave the system in that state
> for some time.
> 3) Rather than decreasing the step size only when reverting the last period,
> decrease the magnitude of the step size whenever we are trying to revert
> tuning done in the last few periods (sum the changes of the last few periods
> and compare to the current step). When its magnitude gets too low, make the
> tuner steps NEUTRAL (no operation). This causes the step size to decrease
> continuously until we reach steady state. After that the tuning process
> restarts (the tuner step size resets when we reach steady state).
> 4) The tuning done in the last few periods is measured as a decaying sum of
> past tuner steps with sign. This value is positive for an increase in
> memstore and negative for an increase in block cache. We use this rather than
> the arithmetic mean in order to give more weight to recent tuner steps.
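
The percentages in step 1 can be checked with a short calculation. This is only a sketch under two assumptions that the description does not state outright: that the per-period statistic is roughly normally distributed, and that δ is its standard deviation. The function names are made up for illustration. The 2 * p * p expression is the estimate used in the description, scaled to 100 periods.

```python
import math


def tail_fraction(k: float) -> float:
    """One-sided tail of a normal distribution: P(X > mu + k * delta).

    Assumes delta is the standard deviation (an assumption, see above).
    """
    return 0.5 * math.erfc(k / math.sqrt(2))


def expected_ops_per_100(k: float) -> float:
    """The description's estimate 2 * p_high * p_low, scaled to 100 periods."""
    p = tail_fraction(k)
    return 100 * 2 * p * p


for k in (0.5, 0.8):
    p = tail_fraction(k)
    print(f"k={k}: one tail {p:.3f}, outside band {2 * p:.3f}, "
          f"ops per 100 periods {expected_ops_per_100(k):.1f}")
```

Under the normality assumption the exact tails come out at roughly 31% for δ/2 and roughly 21% for δ*0.8, so the figures quoted in the description (31%, 22%, 19 and ~10) are rounded values of the same estimate.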
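
Steps 2 to 4 can be illustrated with a toy model. This is not the code in DefaultHeapMemoryTuner.java; it is a minimal sketch in which every name (decayed_sum, is_steady, plan_step), the decay factor of 0.5, the halving of the step size, and the min/max step values are hypothetical choices made for the example.

```python
NEUTRAL = 0.0


def decayed_sum(steps, decay=0.5):
    """Signed sum of past steps (oldest first); recent steps weigh more."""
    total = 0.0
    for step in steps:
        total = total * decay + step
    return total


def is_steady(history, lookup_periods):
    """Steady state: the last `lookup_periods` steps were all NEUTRAL."""
    recent = history[-lookup_periods:]
    return len(recent) == lookup_periods and all(s == NEUTRAL for s in recent)


def plan_step(direction, step_size, history, lookup_periods=10,
              min_step=0.00125, max_step=0.04, decay=0.5):
    """Return (signed_step, new_step_size).

    direction: +1 to grow memstore, -1 to grow block cache, 0 for neutral.
    All default values are illustrative, not HBase defaults.
    """
    recent = history[-lookup_periods:]
    if is_steady(history, lookup_periods):
        step_size = max_step      # steady state reached: tuning restarts
    elif direction * decayed_sum(recent, decay) < 0:
        step_size /= 2            # reverting recent tuning: shrink the step
    if direction == 0 or step_size < min_step:
        return NEUTRAL, step_size  # step too small to matter: do nothing
    return direction * step_size, step_size


# A workload that flips direction every period, as in the read+write heavy case.
history, size = [], 0.04
for i in range(30):
    step, size = plan_step(1 if i % 2 == 0 else -1, size, history)
    history.append(step)
print(history)
```

With an oscillating input the steps shrink, turn NEUTRAL once the step size falls below the minimum, and only return to the maximum step size after a full lookup window of NEUTRAL periods, which is the behaviour the steps above describe.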
> Please see the attachments. One shows the size of memstore (green) and the
> size of block cache (blue) as adjusted by the tuner without these
> modifications, and the other with the above modifications. The x-axis is time
> and the y-axis is the fraction of heap memory available to memstore and block
> cache at that time (the two always sum to 80%). I configured the min/max
> ranges for both components to 0.1 and 0.7 respectively (so in the plots the
> y-axis min and max are 0.1 and 0.7). In both cases the tuner tries to
> distribute memory by giving ~15% to memstore and ~65% to block cache, but the
> modified one does it much more smoothly.
> I got these results from a YCSB test. The test was doing approximately 5000
> inserts and 500 reads per second (for one region server). The results can be
> fine-tuned further, and the number of tuner operations reduced, with the
> following changes in configuration.
> For more fine tuning:
> a) lower max step size (suggested = 4%)
> b) lower min step size (default is also fine)
> To further decrease frequency of tuning operations:
> c) increase the number of lookup periods (in the tests it was just 10,
> default is 60)
> d) increase tuner period (in the tests it was just 20 secs, default is
> 60 secs)
> I used a smaller tuner period and a smaller number of lookup periods to get
> more data points.
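
As a rough worked example of what settings c) and d) mean in wall-clock terms, the sketch below multiplies the number of lookup periods by the tuner period. The helper names are invented for illustration; the input values are the ones quoted above.

```python
def lookup_window_seconds(lookup_periods: int, tuner_period_secs: int) -> int:
    """Wall-clock span covered by the tuner's lookup history."""
    return lookup_periods * tuner_period_secs


def tuner_periods_per_hour(tuner_period_secs: int) -> float:
    """How many tuner periods (data points) fit into one hour."""
    return 3600 / tuner_period_secs


for label, periods, secs in (("test run", 10, 20), ("defaults", 60, 60)):
    print(f"{label}: window {lookup_window_seconds(periods, secs)} s, "
          f"{tuner_periods_per_hour(secs):.0f} periods per hour")
```

The test settings look back over 200 seconds and produce three times as many tuner periods per hour as the defaults, which look back over a full hour.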