Ray Mattingly created HBASE-28513:
-------------------------------------

             Summary: Secondary replica balancing squashes all other cost considerations
                 Key: HBASE-28513
                 URL: https://issues.apache.org/jira/browse/HBASE-28513
             Project: HBase
          Issue Type: Improvement
            Reporter: Ray Mattingly

I have a longer write-up available [here|https://git.hubteam.com/gist/rmattingly/8bc9cbe7c422db12ffc9cd1825069bd7].

In short, a few cost functions have very large default multipliers relative to the rest. For example, `PrimaryRegionCountSkewCostFunction` has a default multiplier of 100,000, while `StoreFileCostFunction` has a multiplier of 5. When one multiplier is 100,000 and the others are single digits, the latter become effectively irrelevant to the balancer's decisions.

I understand that it's critical to distribute a region's replicas across multiple hosts/racks, but I don't think we should do this at the expense of all other balancer considerations.

For example, we could have two types of balancer considerations: costs (as we do now) and conditionals (for the more discrete considerations, such as "more than one replica of the same region should not exist on a single host"). This would let us prioritize replica distribution _and_ still account for things like storefile balance.
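
To make the arithmetic concrete, here is a minimal sketch of the problem and of the proposed split. It assumes a simplified weighted-sum model in which each cost function returns a value in [0, 1]; the class, method, and constant names are hypothetical and are not HBase APIs. Only the two multiplier values (100,000 and 5) are taken from this issue.

```java
// Hypothetical sketch: not HBase code. Assumes total cost is a plain
// weighted sum of per-function costs, each scaled to [0, 1].
final class BalancerCostSketch {
    // Multiplier values quoted in this issue; constant names are illustrative.
    static final double LARGE_MULTIPLIER = 100_000; // replica-related cost
    static final double SMALL_MULTIPLIER = 5;       // storefile cost

    /** Current approach: everything is a weighted cost. */
    static double weightedCost(double replicaCost, double storeFileCost) {
        return LARGE_MULTIPLIER * replicaCost + SMALL_MULTIPLIER * storeFileCost;
    }

    /** Proposed "conditional": a discrete yes/no rule rather than a cost. */
    static boolean replicaConditionHolds(int maxReplicasOfSameRegionOnOneHost) {
        return maxReplicasOfSameRegionOnOneHost <= 1;
    }

    /**
     * Proposed approach: plans that violate a conditional are rejected
     * outright, and the remaining costs are compared on their own scale.
     */
    static double costWithConditional(int maxReplicasOfSameRegionOnOneHost,
                                      double storeFileCost) {
        if (!replicaConditionHolds(maxReplicasOfSameRegionOnOneHost)) {
            return Double.POSITIVE_INFINITY;
        }
        return SMALL_MULTIPLIER * storeFileCost;
    }

    public static void main(String[] args) {
        // Plan A: perfect storefile balance, very slight replica skew.
        double planA = weightedCost(0.001, 0.0);
        // Plan B: no replica skew, worst possible storefile balance.
        double planB = weightedCost(0.0, 1.0);
        // Plan B wins despite the worst storefile balance.
        System.out.println("weighted: B preferred over A? " + (planB < planA));

        // With the conditional satisfied by both plans, storefile balance decides.
        double condA = costWithConditional(1, 0.0);
        double condB = costWithConditional(1, 1.0);
        System.out.println("conditional: A preferred over B? " + (condA < condB));
    }
}
```

Under these assumptions, the worst possible storefile imbalance (cost 1.0) contributes 5 to the total, the same as a replica-related cost of only 0.00005, so the balancer will almost never trade replica skew for storefile balance. Treating replica placement as a pass/fail conditional removes it from the sum and lets the smaller multipliers matter again.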