Clara Xiong created HBASE-25973:
-----------------------------------

             Summary: Balancer should explain progress in a better way in log
                 Key: HBASE-25973
                 URL: https://issues.apache.org/jira/browse/HBASE-25973
             Project: HBase
          Issue Type: Bug
          Components: Balancer
    Affects Versions: 3.0.0-alpha-1
            Reporter: Clara Xiong


At the beginning of a run, the balancer logs the following at INFO level:

 
balancer.StochasticLoadBalancer: start StochasticLoadBalancer.balancer, 
initCost=277.3479243125063, functionCost=RegionCountSkewCostFunction : (500.0, 
0.3749771215224234); ServerLocalityCostFunction : (25.0, 0.5807483226644186); 
RackLocalityCostFunction : (15.0, 0.0); TableSkewCostFunction : (1000.0, 
0.0019704142954972883); StoreFileCostFunction : (200.0, 0.3668512059459341);  
computedMaxSteps: 42270438200
The cost is reported without context, so it is hard for an operator to 
understand how unbalanced the balancer considers the cluster to be and how 
much progress is being made.
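
To illustrate the kind of context that could be added, here is a minimal sketch that breaks the logged cost down per function. It assumes each logged pair is (multiplier, cost) and that the total is the multiplier-weighted sum; under that reading the pairs above sum to the logged initCost within floating-point rounding. The class and method names (CostBreakdown, sampleFromLog, weightedTotal, multiplierSum) are hypothetical and are not part of HBase.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not HBase code: adds context to the logged cost pairs.
final class CostBreakdown {

    // Pairs copied from the log line above, read as {multiplier, cost} (an assumption).
    static Map<String, double[]> sampleFromLog() {
        Map<String, double[]> f = new LinkedHashMap<>();
        f.put("RegionCountSkewCostFunction", new double[] {500.0, 0.3749771215224234});
        f.put("ServerLocalityCostFunction", new double[] {25.0, 0.5807483226644186});
        f.put("RackLocalityCostFunction", new double[] {15.0, 0.0});
        f.put("TableSkewCostFunction", new double[] {1000.0, 0.0019704142954972883});
        f.put("StoreFileCostFunction", new double[] {200.0, 0.3668512059459341});
        return f;
    }

    // Sum of multiplier * cost over all functions.
    static double weightedTotal(Map<String, double[]> functions) {
        double total = 0.0;
        for (double[] mc : functions.values()) {
            total += mc[0] * mc[1];
        }
        return total;
    }

    // Sum of multipliers: the largest total possible if every cost were 1.0.
    static double multiplierSum(Map<String, double[]> functions) {
        double sum = 0.0;
        for (double[] mc : functions.values()) {
            sum += mc[0];
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, double[]> f = sampleFromLog();
        double total = weightedTotal(f);
        double max = multiplierSum(f);
        System.out.printf(Locale.ROOT, "total=%.4f of max %.1f (%.1f%% of worst case)%n",
                total, max, 100.0 * total / max);
        // Show how much each function contributes to the total.
        for (Map.Entry<String, double[]> e : f.entrySet()) {
            double contribution = e.getValue()[0] * e.getValue()[1];
            System.out.printf(Locale.ROOT, "  %s: %.4f (%.1f%% of total)%n",
                    e.getKey(), contribution, 100.0 * contribution / total);
        }
    }
}
```

Reporting the total as a fraction of the multiplier sum, together with each function's share, would let an operator see at a glance which factor dominates the imbalance.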

For a large cluster, the calculation can take a long time, so we also need to 
let the operator know that the calculation may take up to the maximum allowed 
time to complete.

At the end of the computation, the balancer logs:

balancer.StochasticLoadBalancer: Finished computing new load balance plan. 
Computation took PT40M0.006S to try 1036409 different iterations. Found a 
solution that moves 161926 regions; Going from a computed cost of 
118.75715593924485 to a new cost of 1.5509126920967042

The time taken to compute the plan is also printed as an ISO-8601 duration 
(PT40M0.006S), which is not easy for a human to read. We also need to let the 
operator understand that the balancer is only submitting the plan, and that it 
is up to the execution to complete the moves.
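
As a rough illustration of a more readable duration, the sketch below formats a java.time.Duration as hours, minutes and seconds. The class name HumanDuration and its output format are assumptions made for this example, not existing HBase code. It requires Java 9 or later for the toMinutesPart, toSecondsPart and toMillisPart methods, and it does not handle negative durations.

```java
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper, not HBase code: renders a Duration in a readable form.
final class HumanDuration {

    static String format(Duration d) {
        long hours = d.toHours();
        int minutes = d.toMinutesPart();   // Java 9+
        int seconds = d.toSecondsPart();   // Java 9+
        int millis = d.toMillisPart();     // Java 9+
        StringBuilder sb = new StringBuilder();
        if (hours > 0) {
            sb.append(hours).append(" h ");
        }
        if (hours > 0 || minutes > 0) {
            sb.append(minutes).append(" min ");
        }
        sb.append(String.format(Locale.ROOT, "%d.%03d s", seconds, millis));
        return sb.toString();
    }

    public static void main(String[] args) {
        // The value from the log line above, parsed from its ISO-8601 form.
        Duration took = Duration.parse("PT40M0.006S");
        System.out.println(format(took)); // prints "40 min 0.006 s"
    }
}
```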

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
