clarax commented on a change in pull request #3724:
URL: https://github.com/apache/hbase/pull/3724#discussion_r736888044
##########
File path:
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/DoubleArrayCost.java
##########
@@ -66,17 +66,21 @@ void applyCostsChange(Consumer<double[]> consumer) {
}
private static double computeCost(double[] stats) {
+ if (stats == null || stats.length == 0) {
+ return 0;
+ }
double totalCost = 0;
double total = getSum(stats);
double count = stats.length;
double mean = total / count;
-
for (int i = 0; i < stats.length; i++) {
double n = stats[i];
double diff = (mean - n) * (mean - n);
totalCost += diff;
}
+    // No need to compute standard deviation with division by cluster size when scaling.
+ totalCost = Math.sqrt(totalCost);
Review comment:
Total cost = sqrt(sum of the squared deviations from the mean). Since we scale
the total cost linearly between the min cost and the max cost, I skipped the
step of dividing all three of total cost, min cost and max cost by sqrt(count).
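
For illustration, here is a minimal, self-contained sketch of that argument.
It is not the actual DoubleArrayCost/StochasticLoadBalancer code; the class
name, the scale() helper and the min/max values are made up. It only shows
that a linear min/max scaling is unchanged when total cost, min cost and max
cost are all divided by sqrt(count):

```java
// Hypothetical demo class, not part of HBase.
public class ScaleInvarianceDemo {

  // sqrt of the sum of squared deviations from the mean
  // (no division by count, as in the patched computeCost).
  static double computeCost(double[] stats) {
    if (stats == null || stats.length == 0) {
      return 0;
    }
    double total = 0;
    for (double v : stats) {
      total += v;
    }
    double mean = total / stats.length;
    double totalCost = 0;
    for (double v : stats) {
      totalCost += (mean - v) * (mean - v);
    }
    return Math.sqrt(totalCost);
  }

  // Hypothetical linear scaling of cost into [0, 1] between min and max.
  static double scale(double min, double max, double cost) {
    if (max <= min || cost < min) {
      return 0;
    }
    return (cost - min) / (max - min);
  }

  public static void main(String[] args) {
    double[] stats = {3, 7, 5, 9};
    double n = stats.length;

    double cost = computeCost(stats);
    double min = 0;   // assumed cost of a perfectly balanced distribution
    double max = 12;  // assumed worst-case cost, value is arbitrary here

    // Dividing cost, min and max by sqrt(n) cancels out in the ratio,
    // so both print the same scaled value.
    System.out.println(scale(min, max, cost));
    System.out.println(
        scale(min / Math.sqrt(n), max / Math.sqrt(n), cost / Math.sqrt(n)));
  }
}
```

Because the same sqrt(count) factor would divide numerator and denominator of
(cost - min) / (max - min), skipping it does not change the scaled result.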