[ 
https://issues.apache.org/jira/browse/HBASE-25973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17382370#comment-17382370
 ] 

Michael Stack commented on HBASE-25973:
---------------------------------------

Merged to branch-2.3 and branch-2.4. Waiting for the branch-2 backport to finish
testing before I can close this.

> Balancer should explain progress in a better way in log
> -------------------------------------------------------
>
>                 Key: HBASE-25973
>                 URL: https://issues.apache.org/jira/browse/HBASE-25973
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer
>    Affects Versions: 3.0.0-alpha-1
>            Reporter: Clara Xiong
>            Assignee: Clara Xiong
>            Priority: Major
>             Fix For: 2.3.6, 3.0.0-alpha-2, 2.4.5
>
>
> At the beginning of a run, the balancer logs the following at INFO level:
> {code}
> balancer.StochasticLoadBalancer: start StochasticLoadBalancer.balancer, 
> initCost=277.3479243125063, functionCost=RegionCountSkewCostFunction : 
> (500.0, 0.3749771215224234); ServerLocalityCostFunction : (25.0, 
> 0.5807483226644186); RackLocalityCostFunction : (15.0, 0.0); 
> TableSkewCostFunction : (1000.0, 0.0019704142954972883); 
> StoreFileCostFunction : (200.0, 0.3668512059459341);  computedMaxSteps: 
> 42270438200
> {code}
> The cost is reported without context, so it is hard for an operator to
> understand how unbalanced the balancer considers the cluster to be and how
> much progress it is making.
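
Recomputing from the quoted line, initCost appears to be the sum of multiplier * cost over the listed functions. The sketch below reproduces that sum and also divides it by the sum of the multipliers, which is one possible way to put the number in context. The normalization is an assumption for illustration, and the class and method names are hypothetical rather than taken from the balancer code. The result falls between 0 and 1 only if each function's cost does.

```java
public class BalancerCostSketch {
    // {multiplier, cost} pairs copied from the quoted log line, in logged order:
    // RegionCountSkew, ServerLocality, RackLocality, TableSkew, StoreFile.
    public static final double[][] LOGGED = {
        {500.0, 0.3749771215224234},
        {25.0, 0.5807483226644186},
        {15.0, 0.0},
        {1000.0, 0.0019704142954972883},
        {200.0, 0.3668512059459341},
    };

    /** Sum of multiplier * cost over all cost functions. */
    public static double weightedCost(double[][] fns) {
        double sum = 0.0;
        for (double[] f : fns) {
            sum += f[0] * f[1];
        }
        return sum;
    }

    /** Weighted cost divided by the sum of multipliers (hypothetical normalization). */
    public static double normalizedCost(double[][] fns) {
        double multipliers = 0.0;
        for (double[] f : fns) {
            multipliers += f[0];
        }
        return weightedCost(fns) / multipliers;
    }

    public static void main(String[] args) {
        System.out.printf("weighted=%.4f normalized=%.4f%n",
            weightedCost(LOGGED), normalizedCost(LOGGED));
    }
}
```

The weighted sum agrees with the logged initCost=277.3479243125063 to within floating-point rounding.
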
> For a large cluster, the calculation can take a long time. We also need to
> let the operator know that the calculation can take up to the maximum time
> to complete.
> At the end of computation:
> {code}
> balancer.StochasticLoadBalancer: Finished computing new load balance plan. 
> Computation took PT40M0.006S to try 1036409 different iterations. Found a 
> solution that moves 161926 regions; Going from a computed cost of 
> 118.75715593924485 to a new cost of 1.5509126920967042
> {code}
> The time taken to compute the plan is also printed in a format that is not
> human readable. We also need to let the operator understand that the balancer
> is only submitting the plan and that it is up to execution to complete the moves.
>  
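
PT40M0.006S looks like the ISO-8601 text that java.time.Duration.toString() produces. Below is a minimal sketch of rendering such a value for operators, assuming a non-negative duration. The class name, method name, and output layout are hypothetical, not what the balancer does.

```java
import java.time.Duration;
import java.util.Locale;

public class PlanDurationSketch {
    /** Formats a non-negative duration as hours, minutes, and seconds with millis. */
    public static String humanReadable(Duration d) {
        long hours = d.toHours();
        long minutes = d.toMinutes() % 60;
        long seconds = d.getSeconds() % 60;
        long millis = d.toMillis() % 1000;
        // Locale.ROOT keeps the digits independent of the JVM's default locale.
        return String.format(Locale.ROOT, "%dh %02dm %02d.%03ds",
            hours, minutes, seconds, millis);
    }

    public static void main(String[] args) {
        // The value from the quoted log line.
        Duration logged = Duration.parse("PT40M0.006S");
        System.out.println(humanReadable(logged)); // prints "0h 40m 00.006s"
    }
}
```
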



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
