clarax commented on a change in pull request #3724:
URL: https://github.com/apache/hbase/pull/3724#discussion_r728646989



##########
File path: hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/DoubleArrayCost.java
##########
@@ -66,17 +66,21 @@ void applyCostsChange(Consumer<double[]> consumer) {
   }
 
   private static double computeCost(double[] stats) {
+    if (stats == null || stats.length == 0) {
+      return 0;
+    }
     double totalCost = 0;
     double total = getSum(stats);
 
     double count = stats.length;
     double mean = total / count;
-
     for (int i = 0; i < stats.length; i++) {
       double n = stats[i];
-      double diff = Math.abs(mean - n);
+      double diff = (mean - n) * (mean - n);
       totalCost += diff;
     }
+    // No need to compute standard deviation with division by cluster size when scaling.
+    totalCost = Math.sqrt(totalCost);

Review comment:
       Using the standard deviation instead of the linear (absolute) deviation assigns a higher penalty to outliers, and therefore unsticks the balancer when an even region count distribution cannot be achieved because of other constraints such as rack/host constraints.
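
       As a quick illustration of the effect (a standalone sketch with assumed names, not the actual DoubleArrayCost class), both region distributions below have the same total absolute deviation from the mean, but the one with a single overloaded/underloaded pair gets a higher squared-deviation cost, so the balancer still sees a gradient to move along:

```java
// Minimal sketch: compare the old linear (absolute) deviation cost with the
// squared-deviation cost from this change. Names are illustrative only.
public class DeviationCostSketch {

  static double mean(double[] stats) {
    double sum = 0;
    for (double n : stats) {
      sum += n;
    }
    return sum / stats.length;
  }

  // Old cost: sum of |mean - n| over all servers.
  static double linearCost(double[] regionsPerServer) {
    double mean = mean(regionsPerServer);
    double cost = 0;
    for (double n : regionsPerServer) {
      cost += Math.abs(mean - n);
    }
    return cost;
  }

  // New cost: sqrt of the sum of (mean - n)^2, as in the diff above.
  static double squaredCost(double[] regionsPerServer) {
    double mean = mean(regionsPerServer);
    double cost = 0;
    for (double n : regionsPerServer) {
      cost += (mean - n) * (mean - n);
    }
    return Math.sqrt(cost);
  }

  public static void main(String[] args) {
    // Both distributions have mean 10 and the same total absolute deviation (8).
    double[] spreadEvenly = { 8, 8, 12, 12, 10, 10 }; // deviations: 2,2,2,2,0,0
    double[] oneOutlier = { 10, 10, 10, 10, 6, 14 };  // deviations: 0,0,0,0,4,4

    System.out.printf("linear : %.2f vs %.2f%n",
      linearCost(spreadEvenly), linearCost(oneOutlier));   // 8.00 vs 8.00
    System.out.printf("squared: %.2f vs %.2f%n",
      squaredCost(spreadEvenly), squaredCost(oneOutlier)); // 4.00 vs 5.66
  }
}
```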




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

