Clara Xiong created HBASE-25973:
-----------------------------------
Summary: Balancer should explain progress in a better way in log
Key: HBASE-25973
URL: https://issues.apache.org/jira/browse/HBASE-25973
Project: HBase
Issue Type: Bug
Components: Balancer
Affects Versions: 3.0.0-alpha-1
Reporter: Clara Xiong
At the beginning of a run, the balancer logs the following at INFO level:
balancer.StochasticLoadBalancer: start StochasticLoadBalancer.balancer,
initCost=277.3479243125063, functionCost=RegionCountSkewCostFunction : (500.0,
0.3749771215224234); ServerLocalityCostFunction : (25.0, 0.5807483226644186);
RackLocalityCostFunction : (15.0, 0.0); TableSkewCostFunction : (1000.0,
0.0019704142954972883); StoreFileCostFunction : (200.0, 0.3668512059459341);
computedMaxSteps: 42270438200
The cost is reported without context, so it is hard for an operator to
understand how unbalanced the balancer considers the cluster to be and how much
progress is being made.
For a large cluster the calculation can take a long time, so we also need to let
the operator know that it can take up to the maximum allowed time to complete.
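As a rough illustration of what "context" could look like, here is a minimal sketch that takes the (multiplier, cost) pairs from the log line above and turns them into a single relative figure. The weighted sum of the pairs reproduces the logged initCost. Dividing that sum by the sum of the multipliers is only one possible normalization and is an assumption of this sketch, as is the premise that each raw cost lies in [0, 1]. The class and method names are hypothetical and are not part of StochasticLoadBalancer.

```java
import java.util.Locale;

// Hypothetical helper, not HBase code: summarizes (multiplier, cost) pairs.
public class BalancerCostSummary {

    // Sum of multiplier * cost over all cost functions.
    public static double weightedTotal(double[] multipliers, double[] costs) {
        double total = 0.0;
        for (int i = 0; i < multipliers.length; i++) {
            total += multipliers[i] * costs[i];
        }
        return total;
    }

    // Assumed normalization: weighted total divided by the sum of multipliers.
    // If every raw cost is in [0, 1], the result is also in [0, 1].
    public static double normalizedImbalance(double[] multipliers, double[] costs) {
        double sumMultipliers = 0.0;
        for (double m : multipliers) {
            sumMultipliers += m;
        }
        if (sumMultipliers == 0.0) {
            return 0.0;
        }
        return weightedTotal(multipliers, costs) / sumMultipliers;
    }

    public static void main(String[] args) {
        // Values copied from the log excerpt above, in the order logged:
        // RegionCountSkew, ServerLocality, RackLocality, TableSkew, StoreFile.
        double[] multipliers = {500.0, 25.0, 15.0, 1000.0, 200.0};
        double[] costs = {
            0.3749771215224234, 0.5807483226644186, 0.0,
            0.0019704142954972883, 0.3668512059459341
        };
        double total = weightedTotal(multipliers, costs);
        double imbalance = normalizedImbalance(multipliers, costs);
        System.out.println(String.format(Locale.ROOT,
            "weighted cost=%.4f, relative imbalance=%.1f%% of worst case",
            total, imbalance * 100.0));
    }
}
```

A relative figure like this would let an operator compare runs and clusters directly, which the raw weighted cost does not allow.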
At the end of computation:
balancer.StochasticLoadBalancer: Finished computing new load balance plan.
Computation took PT40M0.006S to try 1036409 different iterations. Found a
solution that moves 161926 regions; Going from a computed cost of
118.75715593924485 to a new cost of 1.5509126920967042
The time taken to compute the plan is also printed in a format that is not
human readable (PT40M0.006S). We also need to let the operator understand that
the balancer is only submitting the plan and that it is up to execution to
complete the moves.
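For the duration, a minimal sketch of a more readable rendering follows. PT40M0.006S is the ISO-8601 form produced by java.time.Duration.toString(). The helper below is hypothetical (the class name, method name, and output format are assumptions, not existing HBase code) and uses only plain arithmetic on Duration.toMillis(), so it also works on Java 8.

```java
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper, not HBase code: formats a Duration for log messages.
public class HumanDuration {

    // Renders e.g. "40 min 0.006 s" or "1 h 2 min 3.004 s".
    // Assumes a non-negative duration.
    public static String format(Duration d) {
        long totalMillis = d.toMillis();
        long hours = totalMillis / 3_600_000L;
        long minutes = (totalMillis / 60_000L) % 60;
        long seconds = (totalMillis / 1_000L) % 60;
        long millis = totalMillis % 1_000L;

        StringBuilder sb = new StringBuilder();
        if (hours > 0) {
            sb.append(hours).append(" h ");
        }
        if (hours > 0 || minutes > 0) {
            sb.append(minutes).append(" min ");
        }
        // Seconds are always shown, with millisecond precision.
        sb.append(String.format(Locale.ROOT, "%d.%03d s", seconds, millis));
        return sb.toString();
    }

    public static void main(String[] args) {
        // The value from the log excerpt above.
        Duration elapsed = Duration.parse("PT40M0.006S");
        System.out.println("Computation took " + format(elapsed));
    }
}
```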