Clara Xiong created HBASE-25625:
-----------------------------------

             Summary: StochasticBalancer CostFunctions needs a better way to 
evaluate resource distribution
                 Key: HBASE-25625
                 URL: https://issues.apache.org/jira/browse/HBASE-25625
             Project: HBase
          Issue Type: Improvement
          Components: Balancer, master
            Reporter: Clara Xiong


Currently, cost functions including RegionCountSkewCostFunctions, 
PrimaryRegionCountSkewCostFunctions and all load cost functions measure how 
uneven the distribution is by summing the deviation per region server. 
TableSkewCostFunction uses the sum, across all tables, of the maximum region 
count per server as its measure of unevenness. 

This simple implementation works when the cluster is small. But as the 
cluster gets larger, with more region servers and regions, it does not work 
well with hot spots or a small number of unbalanced servers.

The proposal is to use the standard deviation of the count per region server to 
capture the existence of a small portion of region servers with overwhelming 
load/allocation.
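
To illustrate the difference between the two measures, here is a minimal 
sketch in Python. It is not HBase's cost function code and ignores any 
scaling the balancer applies. The function names and the region counts are 
made up for the example. Both hypothetical clusters have 100 region servers 
and 10000 regions, so the mean is 100 regions per server in each case.

```python
import math


def sum_abs_deviation(counts):
    """Sum of absolute deviations from the mean (the current style of measure)."""
    mean = sum(counts) / len(counts)
    return sum(abs(c - mean) for c in counts)


def std_deviation(counts):
    """Population standard deviation (the proposed style of measure)."""
    mean = sum(counts) / len(counts)
    return math.sqrt(sum((c - mean) ** 2 for c in counts) / len(counts))


# Hypothetical cluster A: every server is off by exactly one region.
mild = [99] * 50 + [101] * 50

# Hypothetical cluster B: one hot server holds 150 regions, the rest are near the mean.
hot_spot = [150] + [99] * 50 + [100] * 49

for name, counts in (("mild", mild), ("hot_spot", hot_spot)):
    print(name, sum_abs_deviation(counts), round(std_deviation(counts), 2))
# prints:
# mild 100.0 1.0
# hot_spot 100.0 5.05
```

The summed deviation is identical for both clusters, so it cannot tell the 
hot spot apart from the mild spread. The standard deviation squares each 
deviation before averaging, so the single overloaded server dominates and 
the value is about five times higher.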

A patch is being tested and will follow shortly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
