[ https://issues.apache.org/jira/browse/HBASE-25973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Stack updated HBASE-25973:
----------------------------------
Fix Version/s: 2.4.5, 2.3.6
> Balancer should explain progress in a better way in log
> -------------------------------------------------------
>
> Key: HBASE-25973
> URL: https://issues.apache.org/jira/browse/HBASE-25973
> Project: HBase
> Issue Type: Bug
> Components: Balancer
> Affects Versions: 3.0.0-alpha-1
> Reporter: Clara Xiong
> Assignee: Clara Xiong
> Priority: Major
> Fix For: 2.3.6, 2.4.5
>
>
> At the beginning of a run, the balancer logs at INFO level:
> {code}
> balancer.StochasticLoadBalancer: start StochasticLoadBalancer.balancer,
> initCost=277.3479243125063, functionCost=RegionCountSkewCostFunction :
> (500.0, 0.3749771215224234); ServerLocalityCostFunction : (25.0,
> 0.5807483226644186); RackLocalityCostFunction : (15.0, 0.0);
> TableSkewCostFunction : (1000.0, 0.0019704142954972883);
> StoreFileCostFunction : (200.0, 0.3668512059459341); computedMaxSteps:
> 42270438200
> {code}
> The cost is reported without context, so it is hard for an operator to
> understand how unbalanced the balancer considers the cluster to be and how
> much progress it is making.
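
To make the missing context concrete: each pair in the log line above appears to be (multiplier, cost), and summing multiplier * cost over the five functions reproduces the logged initCost. The Java sketch below checks that arithmetic and then divides by the sum of the multipliers, which is one possible way to give the operator a bounded figure. It assumes each individual cost is scaled to [0, 1]. The class and method names (CostContext, weightedSum, multiplierSum, normalized) are hypothetical and are not part of HBase, and this is not necessarily how the fix reports the cost.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not an HBase class.
class CostContext {

    /** (multiplier, cost) pairs copied from the log line quoted above. */
    static Map<String, double[]> loggedEntries() {
        Map<String, double[]> m = new LinkedHashMap<>();
        m.put("RegionCountSkewCostFunction", new double[] {500.0, 0.3749771215224234});
        m.put("ServerLocalityCostFunction", new double[] {25.0, 0.5807483226644186});
        m.put("RackLocalityCostFunction", new double[] {15.0, 0.0});
        m.put("TableSkewCostFunction", new double[] {1000.0, 0.0019704142954972883});
        m.put("StoreFileCostFunction", new double[] {200.0, 0.3668512059459341});
        return m;
    }

    /** Sum of multiplier * cost over all cost functions. */
    static double weightedSum(Map<String, double[]> entries) {
        double sum = 0.0;
        for (double[] e : entries.values()) {
            sum += e[0] * e[1];
        }
        return sum;
    }

    /** Sum of the multipliers alone: the largest total the weighted sum can
     *  reach, ASSUMING every individual cost is at most 1. */
    static double multiplierSum(Map<String, double[]> entries) {
        double sum = 0.0;
        for (double[] e : entries.values()) {
            sum += e[0];
        }
        return sum;
    }

    /** Weighted total divided by the sum of multipliers (bounded by 1 under
     *  the assumption above). */
    static double normalized(Map<String, double[]> entries) {
        return weightedSum(entries) / multiplierSum(entries);
    }

    public static void main(String[] args) {
        Map<String, double[]> entries = loggedEntries();
        System.out.printf("weighted total       = %.4f%n", weightedSum(entries));
        System.out.printf("sum of multipliers   = %.1f%n", multiplierSum(entries));
        System.out.printf("normalized imbalance = %.4f%n", normalized(entries));
    }
}
```

For the logged values the weighted total matches the reported initCost (about 277.35), the multipliers sum to 1740, and the normalized figure comes out at roughly 0.16, which is easier to judge at a glance than the raw total.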
> For a large cluster, the calculation can take a long time. We also need to
> let the operator know that it can take up to the max time to complete the
> calculation.
> At the end of the computation, the balancer logs:
> {code}
> balancer.StochasticLoadBalancer: Finished computing new load balance plan.
> Computation took PT40M0.006S to try 1036409 different iterations. Found a
> solution that moves 161926 regions; Going from a computed cost of
> 118.75715593924485 to a new cost of 1.5509126920967042
> {code}
> The time taken to compute the plan is also printed in a format (PT40M0.006S)
> that is not easy for a human to read. We also need to let the operator know
> that the balancer is only submitting the plan and that it is up to execution
> to complete the moves.
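
The PT40M0.006S value in the log above is the ISO-8601 text form that java.time.Duration produces from toString(). A minimal Java sketch of rendering such a value in a friendlier form is shown below. The class name HumanDuration and the output layout are hypothetical choices for illustration and are not taken from the HBase code base; the sketch also assumes a non-negative duration.

```java
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper for illustration only; not an HBase class.
class HumanDuration {

    /** Formats a non-negative duration as e.g. "1 h 2 min 3.000 s". */
    static String format(Duration d) {
        long hours = d.toHours();
        long minutes = d.toMinutes() % 60;
        long seconds = d.getSeconds() % 60;
        long millis = d.toMillis() % 1000;

        StringBuilder sb = new StringBuilder();
        if (hours > 0) {
            sb.append(hours).append(" h ");
        }
        // Show minutes whenever a larger unit was printed, even if zero.
        if (hours > 0 || minutes > 0) {
            sb.append(minutes).append(" min ");
        }
        sb.append(String.format(Locale.ROOT, "%d.%03d s", seconds, millis));
        return sb.toString();
    }

    public static void main(String[] args) {
        // The value copied from the log line quoted above.
        Duration logged = Duration.parse("PT40M0.006S");
        System.out.println(format(logged)); // prints "40 min 0.006 s"
    }
}
```

Only Duration.parse, toHours, toMinutes, getSeconds and toMillis are used, all of which are available since Java 8.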
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)