Ray Mattingly created HBASE-28513:
-------------------------------------

             Summary: Secondary replica balancing squashes all other cost 
considerations
                 Key: HBASE-28513
                 URL: https://issues.apache.org/jira/browse/HBASE-28513
             Project: HBase
          Issue Type: Improvement
            Reporter: Ray Mattingly


I have a longer write-up available 
[here|https://git.hubteam.com/gist/rmattingly/8bc9cbe7c422db12ffc9cd1825069bd7].

A few cost functions have very large default multipliers. For example, 
`PrimaryRegionCountSkewCostFunction` has a default multiplier of 100,000, while 
`StoreFileCostFunction` has a multiplier of 5. When one multiplier is 100,000 
and others are single digits, the latter are effectively irrelevant to the 
balancer's decisions.
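
To make the gap concrete, here is a minimal sketch. It assumes a simplified 
model in which the total cost is the sum of multiplier * cost and each cost is 
scaled to [0, 1]. The class and method names are hypothetical, and the two 
multipliers are the values quoted in this issue, not read from HBase 
configuration.

```java
public class WeightedCostSketch {
    // Multipliers as quoted in this issue (assumed, not read from HBase).
    static final double LARGE_MULTIPLIER = 100_000.0;
    static final double SMALL_MULTIPLIER = 5.0;

    /** Simplified model: sum of multiplier * cost, each cost scaled to [0, 1]. */
    static double weightedCost(double largeWeightCost, double smallWeightCost) {
        return LARGE_MULTIPLIER * largeWeightCost + SMALL_MULTIPLIER * smallWeightCost;
    }

    public static void main(String[] args) {
        // Plan A: perfect storefile balance, tiny skew in the heavily weighted function.
        double planA = weightedCost(0.0001, 0.0);
        // Plan B: worst-case storefile balance, no skew in the heavily weighted function.
        double planB = weightedCost(0.0, 1.0);
        System.out.println("plan A cost = " + planA);
        System.out.println("plan B cost = " + planB);
        System.out.println("plan B preferred: " + (planB < planA));
    }
}
```

Under this model, a 0.01% skew in the heavily weighted function costs more than 
the worst possible storefile imbalance, so plan B is preferred.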

I understand that it's critical to distribute a region's replicas across 
multiple hosts/racks, but I don't think we should do this at the expense of all 
other balancer considerations.

For example, maybe we could have two types of balancer considerations: costs 
(as we do now), and conditionals (for the more discrete considerations, like 
">1 replica of the same region should not exist on a single host"). This would 
allow us to prioritize replica distribution _and_ maintain consideration for 
things like storefile balance.
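
A rough sketch of what that split could look like, assuming candidate 
assignments are represented as a map from host to the regions it carries (with 
replicas of one region sharing a name). Every name here is hypothetical and 
none of it is existing HBase API: the conditional acts as a hard filter, and 
the costs only rank the candidates that pass it.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ConditionalBalancerSketch {

    /** Conditional: no host may carry more than one replica of the same region. */
    static boolean replicasSeparated(Map<String, List<String>> regionsByHost) {
        for (List<String> regions : regionsByHost.values()) {
            Set<String> seen = new HashSet<>();
            for (String region : regions) {
                if (!seen.add(region)) {
                    return false; // same region seen twice on this host
                }
            }
        }
        return true;
    }

    /** Returns the index of the cheapest candidate that passes the conditional, or -1. */
    static int pickBest(List<Map<String, List<String>>> candidates, double[] costs) {
        int best = -1;
        for (int i = 0; i < candidates.size(); i++) {
            if (!replicasSeparated(candidates.get(i))) {
                continue; // rejected outright, regardless of cost
            }
            if (best == -1 || costs[i] < costs[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Candidate 0 is cheaper but puts two replicas of r1 on host1.
        Map<String, List<String>> colocated =
            Map.of("host1", List.of("r1", "r1"), "host2", List.of("r2"));
        // Candidate 1 costs more but keeps the replicas of r1 apart.
        Map<String, List<String>> separated =
            Map.of("host1", List.of("r1", "r2"), "host2", List.of("r1"));
        int best = pickBest(List.of(colocated, separated), new double[] {0.1, 0.4});
        System.out.println("chosen candidate index = " + best); // prints 1
    }
}
```

With replica placement handled as a filter, the remaining cost multipliers no 
longer need to be huge to protect it, so storefile balance can still influence 
the choice among valid plans.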



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
