Ray Mattingly created HBASE-29070:
-------------------------------------

             Summary: Balancer cost function epsilon is imprecise
                 Key: HBASE-29070
                 URL: https://issues.apache.org/jira/browse/HBASE-29070
             Project: HBase
          Issue Type: Bug
    Affects Versions: 2.6.1
            Reporter: Ray Mattingly
            Assignee: Ray Mattingly

The epsilon used by the balancer cost function is imprecise, so the balancer can behave in unexpected ways at certain scales. For example, with enough regions and replicas, the StochasticLoadBalancer#areReplicasColocated (paraphrased) method can easily return a false negative and fail to trigger a balancer run even though badly placed replicas exist.

One immediate consequence of this bug is that our test suites are slow and flaky by design. They all allow huge balancer run times, when in most cases the tests should use tight balancer run times and iterate if necessary. Had we done that, the tests would have exposed the balancer's failure to act when needed, and the test suite would be much faster.
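
To illustrate the kind of false negative described above, here is a minimal sketch. It is not HBase's actual code: the class name, the method names, the epsilon value of 1e-4, and the way the cost is scaled are all assumptions made for illustration. It only shows how comparing a scaled cost against a fixed epsilon can hide a small but nonzero number of colocated replicas once the total grows large enough.

```java
// Hypothetical illustration only; not the StochasticLoadBalancer implementation.
final class EpsilonSketch {
    // Assumed threshold below which a scaled cost is treated as zero.
    static final double ASSUMED_EPSILON = 1e-4;

    // Scaled cost in [0, 1]: fraction of replicas colocated with a sibling replica.
    static double scaledCost(long colocated, long totalReplicas) {
        if (totalReplicas <= 0) {
            return 0.0;
        }
        return (double) colocated / totalReplicas;
    }

    // Epsilon-based check: small nonzero costs fall under the threshold.
    static boolean needsBalanceByEpsilon(long colocated, long totalReplicas) {
        return scaledCost(colocated, totalReplicas) > ASSUMED_EPSILON;
    }

    // Exact check on the raw count: any colocated replica triggers a run.
    static boolean needsBalanceExact(long colocated) {
        return colocated > 0;
    }

    public static void main(String[] args) {
        long colocated = 3;
        for (long total : new long[] {1_000L, 100_000L}) {
            System.out.printf("total=%d epsilon=%b exact=%b%n",
                total,
                needsBalanceByEpsilon(colocated, total),
                needsBalanceExact(colocated));
        }
    }
}
```

With three colocated replicas, the epsilon-based check in this sketch fires when there are 1,000 replicas in total but stays silent at 100,000, while the count-based check fires in both cases.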