[ 
https://issues.apache.org/jira/browse/HBASE-25873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341968#comment-17341968
 ] 

Duo Zhang commented on HBASE-25873:
-----------------------------------

OK, the problem is that I removed the fence in CostFromRegionLoadFunction.

CostFromRegionLoadFunction has two fields, clusterStatus and loads. They are 
used only for fencing; the actual computation uses the loads from 
BalancerClusterState. So in the new code, I simply removed these two fields.

But TestStochasticLoadBalancerLargeCluster never calls setClusterMetrics, so 
the old code skips all the subclasses of CostFromRegionLoadFunction, while the 
new code actually runs them. If I remove this fence from the old code

{code}
    if (clusterStatus == null || loads == null) {
      return 0;
    }
{code}

the performance of the old and new code is the same.
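
To make the effect of the fence concrete, here is a minimal, self-contained sketch. It is not the HBase code: the class RegionLoadCost, the fenced flag, the computeCalls counter, and the two-argument setClusterMetrics signature are all hypothetical stand-ins, and the summing loop only represents "some real work". It shows how a test that never sets the metrics skips the computation entirely when the fence is present, and always pays for it when the fence is gone.

```java
// Hypothetical sketch: names and signatures here are illustrative, not HBase's API.
class FenceSketch {

  /** Stand-in for a CostFromRegionLoadFunction-like cost function. */
  static class RegionLoadCost {
    private final boolean fenced;  // true = old behaviour, false = new behaviour
    private Object clusterStatus;  // used only by the fence, never by the computation
    private Object loads;          // used only by the fence, never by the computation
    int computeCalls = 0;          // counts how often the real computation ran

    RegionLoadCost(boolean fenced) {
      this.fenced = fenced;
    }

    // Assumed setter; a test that never calls it leaves both fields null.
    void setClusterMetrics(Object status, Object regionLoads) {
      this.clusterStatus = status;
      this.loads = regionLoads;
    }

    // The computation reads only the loads passed in (as if from the cluster state).
    double cost(double[] loadsFromClusterState) {
      if (fenced && (clusterStatus == null || loads == null)) {
        return 0; // the fence: skip all work when the metrics were never set
      }
      computeCalls++;
      double sum = 0;
      for (double load : loadsFromClusterState) {
        sum += load;
      }
      return sum;
    }
  }

  public static void main(String[] args) {
    double[] stateLoads = {1.0, 2.0, 3.0};
    RegionLoadCost oldCode = new RegionLoadCost(true);
    RegionLoadCost newCode = new RegionLoadCost(false);

    // Neither instance has had setClusterMetrics called, as in the large-cluster test.
    System.out.println("old: cost=" + oldCode.cost(stateLoads)
        + " computeCalls=" + oldCode.computeCalls);
    System.out.println("new: cost=" + newCode.cost(stateLoads)
        + " computeCalls=" + newCode.computeCalls);
  }
}
```

In this sketch the fenced instance returns 0 without doing any work until setClusterMetrics is called, so a benchmark that never calls it ends up comparing "no work" against "real work" rather than comparing two implementations.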

In general, I do not think a real cluster will ever have null ClusterMetrics 
and RegionLoad, as we have a background chore that periodically updates the 
ClusterMetrics for the balancer, so there is no actual performance gain. And 
maybe we should call setClusterMetrics in the unit tests so that they 
represent the real workload.

Of course, there is still another open question: why does 
CostFromRegionLoadFunction slow things down so much?

Let me dig more.

Thanks.

> Revisit the implementation of CostFunctions
> -------------------------------------------
>
>                 Key: HBASE-25873
>                 URL: https://issues.apache.org/jira/browse/HBASE-25873
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Balancer, Performance
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
