[ 
https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhilash updated HBASE-14058:
-----------------------------
    Attachment: HBASE-14058.patch

> Stabilizing default heap memory tuner
> -------------------------------------
>
>                 Key: HBASE-14058
>                 URL: https://issues.apache.org/jira/browse/HBASE-14058
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Abhilash
>            Assignee: Abhilash
>         Attachments: HBASE-14058.patch, after_modifications.png, 
> before_modifications.png
>
>
> The memory tuner works well in general, but when a workload is both read 
> heavy and write heavy the tuner performs too many tuning operations. We 
> should limit the number of tuner operations and stabilize it. The main 
> problem is that the tuner assumes it is in steady state after seeing just 
> one neutral tuner period, so it performs too many tuning operations and too 
> many reverts, and does so with large step sizes (the step size was reset to 
> the maximum after even one neutral period). To stop this, I propose these 
> steps:
> 1) The band between μ - δ/2 and μ + δ/2 is too narrow. For normally 
> distributed values, ~62% of periods lie outside this range, so 62% of the 
> data points are classified as either high or low, which is too many. Use 
> μ - δ*0.8 and μ + δ*0.8 instead. In expectation this reduces the number of 
> tuner operations per 100 periods from 19 to about 10. With δ/2, 31% of the 
> values are classified as high and 31% as low (2*0.31*0.31 ~ 0.19); with 
> δ*0.8, 22% are low and 22% are high (2*0.22*0.22 ~ 0.10).
> 2) Define steady state properly by looking at the past few periods (as many 
> as hbase.regionserver.heapmemory.autotuner.lookup.periods) rather than just 
> the last tuner operation. We say the tuner is in steady state when the last 
> few tuner periods were all NEUTRAL. We keep decreasing the step size until 
> it is extremely low, then leave the system in that state for some time.
> 3) Rather than decreasing the step size only when reverting the last 
> period's step, decrease its magnitude whenever the current step would revert 
> the tuning done over the last few periods (sum the changes of the last few 
> periods and compare the result with the current step). When the magnitude 
> gets too low, make the tuner step NEUTRAL (no operation). This causes the 
> step size to decrease continuously until we reach steady state. After that 
> the tuning process restarts (the tuner step size resets when we reach 
> steady state).
> 4) The tuning done in the last few periods is tracked as a decaying sum of 
> past tuner steps, with sign. This value is positive for an increase in 
> memstore and negative for an increase in block cache. We use it instead of 
> the arithmetic mean to give more weight to recent tuner steps.
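
The percentages in step 1 can be checked numerically. The snippet below is only a sanity check, not part of the patch. It assumes the per-period values are normally distributed with mean μ and standard deviation δ, and that a tuner operation happens when one metric is classified high and the other low, independently of each other. The function names are made up for illustration.

```python
import math


def tail_fraction(k: float) -> float:
    """Fraction of a normal distribution lying above mean + k * deviation."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


def expected_ops_per_100(k: float) -> float:
    """Expected tuner operations per 100 periods under the step-1 model.

    Assumption: an operation needs one metric high and the other low
    (either way round), and the two classifications are independent.
    """
    p = tail_fraction(k)
    return 100.0 * 2.0 * p * p


for k in (0.5, 0.8):
    outside = 2.0 * tail_fraction(k)
    print(f"k={k}: {outside:.0%} of periods outside the band, "
          f"{expected_ops_per_100(k):.1f} expected operations per 100 periods")
```

Under these assumptions the exact tail for δ*0.8 is about 21%, slightly below the rounded 22% quoted above, so the expected count comes out closer to 9 than 10. The conclusion (roughly half as many operations) is unchanged.
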
> Please see the attachments. One shows the size of the memstore (green) and 
> the size of the block cache (blue) as adjusted by the tuner without these 
> modifications, and the other shows them with the modifications above. I got 
> these results from a YCSB test that was doing approximately 5000 inserts and 
> 500 reads per second (for one region server). The results can be fine-tuned 
> further, and the number of tuner operations reduced, with the following 
> configuration changes.
> For finer tuning:
> a) lower the max step size (suggested = 4%)
> b) lower the min step size (the default is also fine)
> To further decrease the frequency of tuning operations:
> c) increase the number of lookup periods (in the tests it was just 10; the 
> default is 60)
> d) increase the tuner period (in the tests it was just 20 secs; the default 
> is 60 secs)
> I used a smaller tuner period and fewer lookup periods to get more data 
> points.
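
To make the interaction between the knobs in a) to c) and steps 2 to 4 concrete, here is a rough sketch of the step-size logic. It is not the code in the attached patch. The class name, the halving rule, the decay factor of 0.5 and the min step value are all assumptions chosen for illustration. Only the 4% max step and the 10 lookup periods come from the description above.

```python
from collections import deque

NEUTRAL, GROW_MEMSTORE, GROW_BLOCK_CACHE = 0, 1, -1


class StepSizer:
    """Hypothetical sketch of steps 2-4; not the actual HBase tuner class."""

    def __init__(self, max_step=0.04, min_step=0.005, lookup_periods=10, decay=0.5):
        self.max_step = max_step              # a) suggested max step size (4%)
        self.min_step = min_step              # b) assumed value, for illustration
        self.lookup_periods = lookup_periods  # c) 10 in the tests described above
        self.decay = decay                    # assumed decay factor
        self.step = max_step
        self.decayed_sum = 0.0                # step 4: signed, decaying sum of steps
        self.recent = deque(maxlen=lookup_periods)

    def in_steady_state(self):
        # Step 2: steady only if the whole lookup window was NEUTRAL.
        return (len(self.recent) == self.lookup_periods
                and all(d == NEUTRAL for d in self.recent))

    def next_step(self, direction):
        """Return the signed step applied this period (0.0 means no operation)."""
        applied = 0.0
        if direction != NEUTRAL:
            # Step 3: shrink the step when it opposes the net recent tuning.
            if direction * self.decayed_sum < 0:
                self.step /= 2
            # Step 3: a step that has become too small turns into a no-op.
            if self.step >= self.min_step:
                applied = direction * self.step
        self.recent.append(direction if applied != 0.0 else NEUTRAL)
        # Step 4: recent steps weigh more than older ones.
        self.decayed_sum = self.decay * self.decayed_sum + applied
        if self.in_steady_state():
            # Steady state reached: the tuning process may restart at full size.
            self.step = self.max_step
            self.decayed_sum = 0.0
        return applied


sizer = StepSizer(lookup_periods=3)
for d in (GROW_MEMSTORE, GROW_BLOCK_CACHE, GROW_MEMSTORE, GROW_BLOCK_CACHE):
    print(d, sizer.next_step(d))
```

With an alternating workload like the one in the loop, the applied step shrinks each time the tuner tries to undo its recent changes. A larger lookup window makes the reset to the max step rarer, which is the effect c) aims for.
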



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
