rmdmattingly commented on code in PR #6593:
URL: https://github.com/apache/hbase/pull/6593#discussion_r1912451829
##########
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/CostFunction.java:
##########
@@ -25,7 +26,9 @@
 @InterfaceAudience.Private
 abstract class CostFunction {

-  public static final double COST_EPSILON = 0.0001;
+  public static double getCostEpsilon(double cost) {
+    return Math.ulp(cost);
+  }

Review Comment:
   The test failures were difficult to repro because they were caused by the imprecise cost epsilon in our cost functions. The inaccuracy caused our `areRegionReplicasColocated` method to sometimes return a false negative, so if your balancer run ended at the wrong time, it might not start up again to eliminate your final bad replica placements. This was previously covered up by just having a bizarrely long test run; by fixing this bug we've moved the expected runtime of `TestStochasticLoadBalancerRegionReplicaHighReplication` from 2min to 5sec.

   I've updated the cost epsilon to be dynamically calculated using `Math#ulp`, which should let us eliminate floating point calculation errors in a much more precise way.

   This PR definitely has a ton of distinct changes in it at this point, so I would be happy to stack these changes in a feature branch. Spitballing the PRs that I would have if I broke this up:
   1. Fix cost epsilon https://github.com/apache/hbase/pull/6597
   2. Get rid of fragile candidate generator enum ordinal setup, fix generator picking fairness https://github.com/apache/hbase/pull/6598
   3. Fix awareness of rack colocation in `areReplicasColocated`
   4. [Fix null multiplier possibility](https://github.com/apache/hbase/pull/6593/files#r1909329933)
   5. Add region plan conditional framework, plus 1 conditional+generator+tests (probably replica distribution)
   6. Add meta table isolation conditional+generator+tests
   7. Add system table isolation conditional+generator+tests
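
   For illustration, here is a minimal sketch of the kind of false negative described above. It is not code from the PR: the class name `CostEpsilonSketch`, the helpers `isNonZeroFixed` and `isNonZeroUlp`, and the sample cost of 0.00005 are all hypothetical, and it assumes the epsilon is used as a threshold for deciding whether a cost is distinguishable from zero. Only `getCostEpsilon` mirrors the diff.

```java
public class CostEpsilonSketch {
  // The fixed tolerance that the diff removes.
  public static final double FIXED_COST_EPSILON = 0.0001;

  // Mirrors the replacement in the diff: the tolerance is the spacing
  // between adjacent doubles at the magnitude of the given cost.
  public static double getCostEpsilon(double cost) {
    return Math.ulp(cost);
  }

  // Hypothetical helper: treat the cost as non-zero only if it exceeds
  // the fixed tolerance.
  public static boolean isNonZeroFixed(double cost) {
    return Math.abs(cost) > FIXED_COST_EPSILON;
  }

  // Hypothetical helper: treat the cost as non-zero if it exceeds the
  // ulp-based tolerance for its own magnitude.
  public static boolean isNonZeroUlp(double cost) {
    return Math.abs(cost) > getCostEpsilon(cost);
  }

  public static void main(String[] args) {
    // Assumed sample value: a small but real cost, below 0.0001.
    double smallCost = 0.00005;

    // The fixed tolerance hides the cost (a false negative).
    System.out.println("fixed epsilon sees cost: " + isNonZeroFixed(smallCost));

    // The ulp-based tolerance still detects it.
    System.out.println("ulp epsilon sees cost:   " + isNonZeroUlp(smallCost));

    // An exact zero is still reported as zero by both checks.
    System.out.println("ulp epsilon on 0.0:      " + isNonZeroUlp(0.0));
  }
}
```

   The design point is that `Math.ulp` scales with the value being compared, so the tolerance only absorbs rounding at that magnitude rather than anything smaller than an arbitrary constant.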