rmdmattingly commented on code in PR #6593:
URL: https://github.com/apache/hbase/pull/6593#discussion_r1912451829


##########
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/CostFunction.java:
##########
@@ -25,7 +26,9 @@
 @InterfaceAudience.Private
 abstract class CostFunction {
 
-  public static final double COST_EPSILON = 0.0001;
+  public static double getCostEpsilon(double cost) {
+    return Math.ulp(cost);
+  }

Review Comment:
   The test failures were difficult to reproduce because they were caused by the 
imprecise cost epsilon in our cost functions. The inaccuracy caused our 
`areRegionReplicasColocated` method to sometimes return a false negative, so if 
a balancer run ended at the wrong time, the balancer might not start up again 
to eliminate the final bad replica placements. This was previously masked by an 
unusually long test run. With this bug fixed, the expected runtime of 
`TestStochasticLoadBalancerRegionReplicaHighReplication` drops from 2min to 
5sec.
   
   I've updated the cost epsilon to be calculated dynamically using `Math#ulp`, 
which scales the tolerance to the magnitude of the cost being compared. That 
should let us absorb floating-point rounding error much more precisely than 
the fixed 0.0001 did.
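   
   To illustrate the difference, here is a minimal standalone sketch. It is 
not the balancer code: `getCostEpsilon` mirrors the method in the diff above, 
while the class name, the `exceedsFixed`/`exceedsUlp` helpers, and the sample 
cost values are hypothetical and exist only for this example.

```java
public class CostEpsilonSketch {

  // The old fixed tolerance that this change removes.
  public static final double FIXED_EPSILON = 0.0001;

  // Same body as the method in the diff: one unit in the last place of cost.
  public static double getCostEpsilon(double cost) {
    return Math.ulp(cost);
  }

  // Hypothetical helper: is a meaningfully above b under the fixed tolerance?
  public static boolean exceedsFixed(double a, double b) {
    return a > b + FIXED_EPSILON;
  }

  // Hypothetical helper: is a meaningfully above b under the ulp tolerance?
  public static boolean exceedsUlp(double a, double b) {
    return a > b + getCostEpsilon(b);
  }

  public static void main(String[] args) {
    // Assumed values: a small but real cost compared against a zero cost.
    double smallRealCost = 0.00005;
    double zeroCost = 0.0;

    // The fixed epsilon swallows the small cost, a false negative.
    System.out.println(exceedsFixed(smallRealCost, zeroCost)); // false
    // The ulp-based epsilon still sees it.
    System.out.println(exceedsUlp(smallRealCost, zeroCost)); // true

    // Pure rounding noise (0.1 + 0.2 is one ulp above 0.3) is still ignored.
    System.out.println(exceedsUlp(0.1 + 0.2, 0.3)); // false
  }
}
```

   The point of the sketch is that a fixed 0.0001 hides any real cost smaller 
than itself, whereas a tolerance of one ulp only hides differences on the scale 
of a single rounding step.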
   
   This PR definitely has a ton of distinct changes in it at this point, so I 
would be happy to stack these changes in a feature branch. Spitballing the PRs 
that I would have if I broke this up:
   1. Fix cost epsilon https://github.com/apache/hbase/pull/6597
   2. Get rid of fragile candidate generator enum ordinal setup, fix generator 
picking fairness https://github.com/apache/hbase/pull/6598
   3. Fix awareness of rack colocation in `areReplicasColocated`
   4. [Fix null multiplier 
possibility](https://github.com/apache/hbase/pull/6593/files#r1909329933)
   5. Add region plan conditional framework, plus 1 conditional+generator+tests 
(probably replica distribution)
   6. Add meta table isolation conditional+generator+tests
   7. Add system table isolation conditional+generator+tests


