[
https://issues.apache.org/jira/browse/HBASE-25873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341968#comment-17341968
]
Duo Zhang commented on HBASE-25873:
-----------------------------------
OK, the problem is that I removed the fence in CostFromRegionLoadFunction.
CostFromRegionLoadFunction had two fields, clusterStatus and loads, that were
used only for fencing; the actual computation uses the loads from
BalancerClusterState. So in the new code I simply removed these two fields.
But TestStochasticLoadBalancerLargeCluster never calls setClusterMetrics, so
the old code skipped all the subclasses of CostFromRegionLoadFunction, while
the new code actually runs them. If I remove this fence from the old code
{code}
if (clusterStatus == null || loads == null) {
  return 0;
}
{code}
the performance of the old and new code is the same.
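To illustrate the effect of the fence, here is a minimal, self-contained
sketch. All names in it (FenceSketch, OldStyleCost, NewStyleCost, compute) are
hypothetical stand-ins, not the actual HBase classes, and a plain Map stands in
for the real load data. It only shows why a test that never calls
setClusterMetrics skips the computation in the old code but not in the new code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not HBase code: models the null-check "fence" only.
class FenceSketch {
  // Counts how often the (supposedly expensive) computation actually runs.
  static int expensiveCalls = 0;

  // Stand-in for the real cost computation over region loads.
  static double compute(Map<String, Double> stateLoads) {
    expensiveCalls++;
    double sum = 0;
    for (double v : stateLoads.values()) {
      sum += v;
    }
    return sum;
  }

  // Old behaviour: the two fields are used only as a fence.
  static class OldStyleCost {
    private Map<String, Double> clusterStatus;
    private Map<String, Double> loads;

    void setClusterMetrics(Map<String, Double> metrics) {
      this.clusterStatus = metrics;
      this.loads = metrics;
    }

    double cost(Map<String, Double> stateLoads) {
      if (clusterStatus == null || loads == null) {
        return 0; // fence: skipped when setClusterMetrics was never called
      }
      return compute(stateLoads);
    }
  }

  // New behaviour: no fence, the computation always runs.
  static class NewStyleCost {
    double cost(Map<String, Double> stateLoads) {
      return compute(stateLoads);
    }
  }

  public static void main(String[] args) {
    Map<String, Double> stateLoads = new HashMap<>();
    stateLoads.put("region-a", 1.0);
    stateLoads.put("region-b", 2.0);

    OldStyleCost oldCost = new OldStyleCost();
    System.out.println("old, no metrics: " + oldCost.cost(stateLoads)
        + " (computations so far: " + expensiveCalls + ")");

    NewStyleCost newCost = new NewStyleCost();
    System.out.println("new:             " + newCost.cost(stateLoads)
        + " (computations so far: " + expensiveCalls + ")");

    oldCost.setClusterMetrics(stateLoads);
    System.out.println("old, metrics set: " + oldCost.cost(stateLoads)
        + " (computations so far: " + expensiveCalls + ")");
  }
}
```

In this sketch, once setClusterMetrics has been called the old-style and
new-style classes do the same amount of work, which is the situation expected
in a real cluster.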
In general, I do not think a real cluster will have null ClusterMetrics and
RegionLoad, as a background chore periodically updates the ClusterMetrics for
the balancer, so the fence gives no actual performance gain in practice. And
maybe we should call setClusterMetrics in the UTs so that they represent the
real workload.
Of course, there is still another problem: why does CostFromRegionLoadFunction
slow things down so much?
Let me dig more.
Thanks.
> Revisit the implementation of CostFunctions
> -------------------------------------------
>
> Key: HBASE-25873
> URL: https://issues.apache.org/jira/browse/HBASE-25873
> Project: HBase
> Issue Type: Sub-task
> Components: Balancer, Performance
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
>