[ 
https://issues.apache.org/jira/browse/HBASE-22300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16824540#comment-16824540
 ] 

Biju Nair edited comment on HBASE-22300 at 4/23/19 9:15 PM:
------------------------------------------------------------

One obvious issue is the time taken by all the cost functions that extend 
[CostFromRegionLoadFunction|https://github.com/apache/hbase/blob/baf3ae80f5588ee848176adefc9f56818458a387/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L1371].
 Each time the cost is calculated in a "step", it loops through all the 
regions on [all the servers in the 
cluster|https://github.com/apache/hbase/blob/baf3ae80f5588ee848176adefc9f56818458a387/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L1398-L1410],
 i.e. number of servers * regions per server * number of cost functions 
(currently 5). A quick improvement would be to calculate the cost for each server 
once during the init of the cost function, then recalculate the cost only for 
the individual servers affected by the action in a step (region move, region 
assign, etc.) and use those per-server costs to come up with the overall cost.
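
To make the idea concrete, here is a minimal sketch of a cost function that caches per-server costs and updates them per action. It is not the StochasticLoadBalancer code: the class IncrementalRegionLoadCost, its methods, and the "total absolute deviation from the mean" cost are all hypothetical stand-ins chosen for illustration, and only the region move action is shown.

```java
import java.util.Arrays;

/** Hypothetical sketch; names and cost formula are assumptions, not HBase APIs. */
class IncrementalRegionLoadCost {
  private final double[] regionCost;   // cost contributed by each region (e.g. a request count)
  private final int[] regionToServer;  // index of the server currently hosting each region
  private final double[] serverCost;   // cached sum of region costs per server

  IncrementalRegionLoadCost(double[] regionCost, int[] regionToServer, int numServers) {
    this.regionCost = regionCost.clone();
    this.regionToServer = regionToServer.clone();
    this.serverCost = new double[numServers];
    // The only full pass over all regions happens once, at init.
    for (int r = 0; r < this.regionCost.length; r++) {
      serverCost[this.regionToServer[r]] += this.regionCost[r];
    }
  }

  /** Applies a region move by adjusting only the two affected servers. */
  void moveRegion(int region, int toServer) {
    int fromServer = regionToServer[region];
    if (fromServer == toServer) {
      return;
    }
    serverCost[fromServer] -= regionCost[region];
    serverCost[toServer] += regionCost[region];
    regionToServer[region] = toServer;
  }

  /** Overall cost from the cached per-server costs: loops over servers, not regions. */
  double cost() {
    return deviationFromMean(serverCost);
  }

  /** Reference version that rescans every region, kept only for comparison. */
  double costByFullScan() {
    double[] fresh = new double[serverCost.length];
    for (int r = 0; r < regionCost.length; r++) {
      fresh[regionToServer[r]] += regionCost[r];
    }
    return deviationFromMean(fresh);
  }

  // Stand-in cost (assumption): total absolute deviation from the mean server cost.
  private static double deviationFromMean(double[] perServer) {
    double mean = Arrays.stream(perServer).sum() / perServer.length;
    double total = 0;
    for (double c : perServer) {
      total += Math.abs(c - mean);
    }
    return total;
  }

  public static void main(String[] args) {
    IncrementalRegionLoadCost f =
        new IncrementalRegionLoadCost(new double[] {4, 2, 6}, new int[] {0, 0, 1}, 2);
    System.out.println("cost before move: " + f.cost());
    f.moveRegion(0, 1);
    System.out.println("cost after move:  " + f.cost());
    System.out.println("full-scan cost:   " + f.costByFullScan());
  }
}
```

With this shape, the per-step work depends on the number of servers rather than the total number of regions. A real implementation would also need to handle the other action types and the possibility of floating-point drift from repeated add/subtract updates.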


> SLB doesn't perform well with increase in number of regions
> -----------------------------------------------------------
>
>                 Key: HBASE-22300
>                 URL: https://issues.apache.org/jira/browse/HBASE-22300
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer
>            Reporter: Biju Nair
>            Priority: Major
>
> As the number of regions in a cluster increases, the number of steps taken by 
> the balancer in 30 sec (the default balancer runtime) drops noticeably. The 
> following shows the number of steps taken by the balancer with region loads 
> set and without the loads being set, i.e. when the cost functions that use 
> region loads are not fully exercised.
> {noformat}
> Nodes  Regions  Tables  # of steps      # of steps
>                         (with RS load)  (with no load)
> 5      50       5       200000          200000
> 100    2000     110     104707          1000000
> 100    10000    40      19911           1000000
> 200    100000   400     870             1000000
> {noformat}
> As one would expect, the reduced number of steps also makes the balancer take 
> a long time to reach an optimal cost. Note that only 2 data points were used 
> in the region load histogram, while in practice 15 region load data points 
> are remembered.


