Abhilash created HBASE-14058:
--------------------------------

             Summary: Stabilize heap memory tuner
                 Key: HBASE-14058
                 URL: https://issues.apache.org/jira/browse/HBASE-14058
             Project: HBase
          Issue Type: Improvement
          Components: regionserver
            Reporter: Abhilash
            Assignee: Abhilash


The memory tuner works well in the general case, but when the workload is both
read heavy and write heavy the tuner performs too many tuning operations. We
should limit the number of tuner operations and stabilize it. The main problem
is that the tuner thinks it is in steady state after seeing just one neutral
tuner period, so it does too many tuning operations and too many reverts, and
with large step sizes at that (the step size was set to maximum after even a
single neutral period). To stop this I propose the following steps:

1) The band between μ - δ/2 and μ + δ/2 is too narrow. Statistically, ~62% of
periods will lie outside this range, which means 62% of the data points are
considered either high or low, which is too many. Use μ + δ*0.8 and μ - δ*0.8
instead. In expectation this decreases the number of tuner operations per 100
periods from 19 to about 10. With δ/2, 31% of data values are considered high
and 31% are considered low (2*0.31*0.31 ~ 0.19); with δ*0.8, 22% are low and
22% are high (2*0.22*0.22 ~ 0.10).
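
As a rough illustration of step 1, the sketch below classifies a sample against
the band μ ± k*δ and reproduces the 2*p*p estimate from the figures quoted
above. It is not the actual HBase code; the class, enum and method names are
hypothetical and exist only for this example.

```java
// Illustrative sketch only; these names are hypothetical, not HBase APIs.
final class ThresholdSketch {
    enum Level { LOW, NEUTRAL, HIGH }

    // Classify a sample against the band [mean - k*dev, mean + k*dev].
    static Level classify(double value, double mean, double dev, double k) {
        if (value > mean + k * dev) {
            return Level.HIGH;
        }
        if (value < mean - k * dev) {
            return Level.LOW;
        }
        return Level.NEUTRAL;
    }

    // Expected tuner operations per 100 periods, using the 2*p*p estimate
    // from the text, where p is the fraction of samples in each tail.
    static double expectedOpsPer100(double tailFraction) {
        return 100.0 * 2.0 * tailFraction * tailFraction;
    }

    public static void main(String[] args) {
        System.out.println(expectedOpsPer100(0.31)); // band of width delta/2
        System.out.println(expectedOpsPer100(0.22)); // band of width delta*0.8
    }
}
```

With a wider band, a sample such as 6.2 (mean 5, deviation 2) is HIGH for
k = 0.5 but NEUTRAL for k = 0.8, which is the effect the change is after.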

2) Define a proper steady state by looking at the past few periods (the count
is hbase.regionserver.heapmemory.autotuner.lookup.periods) rather than just the
last tuner operation. We say the tuner is in steady state when the last few
tuner periods were all NEUTRAL. We keep decreasing the step size until it is
extremely low, then leave the system in that state for some time.
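
One way to picture the steady-state check in step 2 is a sliding window over
the most recent tuner steps. This is only a sketch under assumptions: the
class and enum names are made up for the example, and the window size stands
in for the lookup-periods setting mentioned above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only; these names are hypothetical, not HBase APIs.
final class SteadyStateSketch {
    enum Step { INCREASE_MEMSTORE, INCREASE_BLOCK_CACHE, NEUTRAL }

    private final int lookupPeriods;
    private final Deque<Step> recent = new ArrayDeque<>();

    SteadyStateSketch(int lookupPeriods) {
        this.lookupPeriods = lookupPeriods;
    }

    // Remember the latest step, keeping at most lookupPeriods entries.
    void record(Step step) {
        recent.addLast(step);
        if (recent.size() > lookupPeriods) {
            recent.removeFirst();
        }
    }

    // Steady state: a full window has been seen and every step in it was NEUTRAL.
    boolean isSteadyState() {
        if (recent.size() < lookupPeriods) {
            return false;
        }
        for (Step s : recent) {
            if (s != Step.NEUTRAL) {
                return false;
            }
        }
        return true;
    }
}
```

A single NEUTRAL period is no longer enough here: with a window of 3, the
check only succeeds after three NEUTRAL periods in a row.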

3) Rather than decreasing the step size only when reverting the last period,
decrease the magnitude of the step size whenever we are trying to revert the
tuning done in the last few periods (sum the changes of the last few periods
and compare with the current step). When its magnitude gets too low, make the
tuner steps NEUTRAL (no operation). This causes the step size to decrease
continuously until we reach steady state. After that the tuning process
restarts (the tuner step size resets when we reach steady state).

4) The tuning done in the last few periods is tracked as a decaying sum of
past tuner steps, with sign. This value is positive for a net increase in
memstore and negative for a net increase in block cache. We use this rather
than the arithmetic mean so that recent tuner steps get more weight.
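
Steps 3 and 4 together could look roughly like the sketch below. It is a
minimal illustration, not the proposed patch: the decay factor, the minimum
step, the halving on revert and all names are assumptions chosen for the
example, since the description above does not fix them.

```java
// Illustrative sketch only; names and constants are assumptions, not HBase code.
final class DecayingStepSketch {
    static final double DECAY = 0.5;      // assumed weight given to older steps
    static final double MIN_STEP = 0.001; // assumed cutoff below which steps are NEUTRAL

    private final double initialStep;
    private double stepSize;
    // Positive: net increase in memstore; negative: net increase in block cache.
    private double decayingSum = 0.0;

    DecayingStepSketch(double initialStep) {
        this.initialStep = initialStep;
        this.stepSize = initialStep;
    }

    // direction is +1 (grow memstore) or -1 (grow block cache).
    // Returns the signed step actually applied; 0.0 means NEUTRAL.
    double apply(int direction) {
        // A step is a revert if it opposes the recent net tuning direction.
        boolean reverting = decayingSum * direction < 0;
        if (reverting) {
            stepSize /= 2.0; // assumed shrink factor
        }
        double applied = stepSize < MIN_STEP ? 0.0 : direction * stepSize;
        // Older steps fade by DECAY each period, so recent steps dominate.
        decayingSum = decayingSum * DECAY + applied;
        return applied;
    }

    // Called once steady state is reached, so that tuning can start afresh.
    void resetStep() {
        stepSize = initialStep;
    }

    double decayingSum() {
        return decayingSum;
    }
}
```

Starting from a step of 0.08, two memstore increases followed by a block cache
request give applied steps of 0.08, 0.08 and -0.04: the third step is halved
because it opposes the positive decaying sum.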



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)