[ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107757#comment-15107757 ]

Hudson commented on HBASE-14058:
--------------------------------

SUCCESS: Integrated in HBase-1.2-IT #401 (See 
[https://builds.apache.org/job/HBase-1.2-IT/401/])
HBASE-14058 Stabilizing default heap memory tuner (eclark: rev 
e738e69f8cc59581a454207483aca42e7f314396)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultHeapMemoryTuner.java


> Stabilizing default heap memory tuner
> -------------------------------------
>
>                 Key: HBASE-14058
>                 URL: https://issues.apache.org/jira/browse/HBASE-14058
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 2.0.0, 1.2.0, 1.3.0
>            Reporter: Abhilash
>            Assignee: Abhilash
>             Fix For: 2.0.0, 1.3.0
>
>         Attachments: 0001-Stabilizing-default-heap-memory-tuner.patch, 
> HBASE-14058-v1.patch, HBASE-14058.patch, after_modifications.png, 
> before_modifications.png
>
>
> The memory tuner works well in general, but when a workload is both read 
> heavy and write heavy the tuner performs too many tuning operations. We 
> should limit the number of tuner operations and stabilize it. The main 
> problem is that the tuner thinks it is in steady state after seeing just 
> one neutral tuner period, so it does too many tuning operations and too 
> many reverts, and with large step sizes as well (the step size was set to 
> the maximum after even one neutral period). To stop this I propose the 
> following steps:
> 1) The band between μ - δ/2 and μ + δ/2 is too narrow. Statistically, 
> ~62% of periods lie outside this range, which means 62% of the data points 
> are considered either high or low, which is too many. Use μ + δ*0.8 and μ - 
> δ*0.8 instead. In expectation this decreases the number of tuner operations 
> per 100 periods from 19 to about 10. With δ/2, 31% of data values are 
> considered high and 31% low (2*0.31*0.31 = 0.19); with δ*0.8, 22% are low 
> and 22% are high (2*0.22*0.22 ~ 0.10).
> 2) Define steady state properly by looking at the past few periods (the 
> number is equal to hbase.regionserver.heapmemory.autotuner.lookup.periods) 
> rather than just the last tuner operation. The tuner is in steady state when 
> the last few tuner periods were all NEUTRAL. We keep decreasing the step 
> size until it is extremely low, then leave the system in that state for 
> some time.
> 3) Rather than decreasing the step size only while reverting the last 
> period, decrease its magnitude whenever we are trying to revert the tuning 
> done in the last few periods (sum the changes of the last few periods and 
> compare with the current step). When the magnitude gets too low, make the 
> tuner step NEUTRAL (no operation). This causes the step size to decrease 
> continuously until we reach steady state. After that the tuning process 
> restarts (the tuner step size resets when we reach steady state).
> 4) The tuning done in the last few periods is measured as a decaying sum of 
> past tuner steps with sign. This value is positive for an increase in 
> memstore and negative for an increase in block cache. We use this rather 
> than the arithmetic mean to give more weight to recent tuner steps.
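
The percentages in step 1 can be reproduced with a short calculation. This is
only a sketch: it assumes the per-period statistic is normally distributed
around μ with standard deviation δ (the description implies this but does not
state it), and the function names are made up for illustration. With the exact
normal tail, the 0.8 band gives about 21% per side and 2*p*p of about 0.09,
slightly below the rounded 22% and ~0.10 quoted above.

```python
import math


def tail_fraction(k: float) -> float:
    """Fraction of a normal distribution lying above mean + k * stddev."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


def two_p_squared(k: float) -> float:
    """The 2*p*p figure used in step 1, with p the one-sided tail fraction."""
    p = tail_fraction(k)
    return 2.0 * p * p


if __name__ == "__main__":
    # Compare the old band (k = 0.5) with the proposed band (k = 0.8).
    for k in (0.5, 0.8):
        p = tail_fraction(k)
        print(f"k={k}: high={p:.1%}, low={p:.1%}, "
              f"outside band={2 * p:.1%}, 2*p*p={two_p_squared(k):.3f}")
```

By symmetry the "low" fraction equals the "high" fraction, so one tail
function is enough for both sides of the band.
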
> Please see the attachments. One shows the size of memstore (green) and the 
> size of block cache (blue) as adjusted by the tuner without these 
> modifications, and the other with them. The x-axis is time and the y-axis is 
> the fraction of heap memory available to memstore and block cache at that 
> time (the two always sum to 80%). I configured the min/max range for both 
> components to 0.1 and 0.7 respectively (so in the plots the y-axis min and 
> max are 0.1 and 0.7). In both cases the tuner tries to distribute memory by 
> giving ~15% to memstore and ~65% to block cache, but the modified one does 
> it much more smoothly.
> I got these results from a YCSB test. The test was doing approximately 5000 
> inserts and 500 reads per second (for one region server). The results can be 
> fine-tuned further, and the number of tuner operations reduced, with these 
> configuration changes.
> For finer tuning:
> a) lower the max step size (suggested = 4%)
> b) lower the min step size (the default is also fine)
> To further decrease the frequency of tuning operations:
> c) increase the number of lookup periods (in the tests it was just 10; the 
> default is 60)
> d) increase the tuner period (in the tests it was just 20 secs; the default 
> is 60 secs)
> I used a smaller tuner period and number of lookup periods to get more data 
> points.
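
To see how the max step, min step and lookup-period settings interact with
steps 2 to 4 of the description, here is a minimal sketch of the step-size
logic. It is not the code in DefaultHeapMemoryTuner.java: the class name
StepSizer, the decay factor of 0.5, the halving rule and all default values
are assumptions chosen for illustration.

```python
from collections import deque

NEUTRAL = 0
INCREASE_MEMSTORE = 1       # positive sign, as in step 4
INCREASE_BLOCK_CACHE = -1   # negative sign, as in step 4


class StepSizer:
    """Hypothetical model of steps 2-4; all defaults are illustrative."""

    def __init__(self, max_step=0.04, min_step=0.005,
                 lookup_periods=10, decay=0.5):
        self.max_step = max_step
        self.min_step = min_step
        self.decay = decay
        self.step = max_step
        self.decaying_sum = 0.0  # signed, recent steps weigh more
        self.recent = deque(maxlen=lookup_periods)

    def in_steady_state(self):
        # Step 2: steady only when the whole lookup window is NEUTRAL.
        return (len(self.recent) == self.recent.maxlen
                and all(d == NEUTRAL for d in self.recent))

    def next_step(self, wanted):
        """Return the signed change to apply to the memstore fraction."""
        if wanted != NEUTRAL:
            # Step 3: shrink the step when it would undo recent tuning.
            if wanted * self.decaying_sum < 0:
                self.step /= 2.0
            # Step 3: a step that is too small becomes a no-op.
            if self.step < self.min_step:
                wanted = NEUTRAL
        applied = wanted * self.step
        # Step 4: decaying sum of past steps with sign.
        self.decaying_sum = self.decay * self.decaying_sum + applied
        self.recent.append(wanted)
        if self.in_steady_state():
            # Steady state reached: tuning restarts with the full step.
            self.step = self.max_step
            self.decaying_sum = 0.0
        return applied


if __name__ == "__main__":
    sizer = StepSizer(lookup_periods=3)
    # A workload that keeps flipping direction makes the steps shrink.
    for wanted in [1, -1, 1, -1, 1, -1, 0, 0, 0, 1]:
        print(wanted, sizer.next_step(wanted))
```

In this model a larger lookup window delays the reset to the max step, and a
smaller max step bounds how far a single period can move the split.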



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
