[ 
https://issues.apache.org/jira/browse/HBASE-23073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986769#comment-16986769
 ] 

Pierre Zemb commented on HBASE-23073:
-------------------------------------

https://github.com/apache/hbase/pull/894

> Add an optional costFunction to balance regions according to a capacity rule
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-23073
>                 URL: https://issues.apache.org/jira/browse/HBASE-23073
>             Project: HBase
>          Issue Type: New Feature
>          Components: master
>    Affects Versions: 3.0.0
>            Reporter: Pierre Zemb
>            Assignee: Pierre Zemb
>            Priority: Minor
>             Fix For: 3.0.0, 2.3.0
>
>         Attachments: HBASE-23073.branch-1.0002.patch, 
> HBASE-23073.branch-1.001.patch
>
>
> Based on the work in 
> [HBASE-22618|https://issues.apache.org/jira/browse/HBASE-22618], users can 
> now load custom costFunctions inside the main balancer used by HBase. As an 
> example, we would like to add upstream an optional cost function called 
> HeterogeneousRegionCountCostFunction that will deal with our issue: how to 
> balance regions according to the capacity of a RS instead of using the 
> RegionCountSkewCostFunction that is trying to avoid skew.
> A rule file is loaded from HDFS before balancing. It contains lines of rules. 
> A rule is composed of a regexp for hostname, and a limit. For example, we 
> could have:
> * rs[0-9] 200
> * rs1[0-9] 50 
> RegionServers whose hostname matches the first rule will have a limit of 
> 200, and those matching the second rule a limit of 50. If there's no match, 
> a default is set.
> Thanks to the rules, we have two pieces of information: the max number of 
> regions for this cluster, and the limit for each server. 
> HeterogeneousBalancer will try to balance regions according to each 
> server's capacity.
> Let's take an example. Let's say that we have 20 RS:
>     10 RS, named rs0 through rs9, loaded with 60 regions each, and each can 
> handle 200 regions.
>     10 RS, named rs10 through rs19, loaded with 60 regions each, and each 
> can support 50 regions.
> Based on the following rules: 
>     rs[0-9] 200
>     rs1[0-9] 50
> The second group is overloaded (60 regions against a limit of 50), whereas 
> the first group has plenty of space (60 against 200). Moving a region from 
> the second group to the first should provide a lower cost.
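
The rule lookup and the 20-RS example from the description can be sketched as follows. This is a minimal illustration, not the code from the pull request: the class and method names (CapacityRuleSketch, parse, limitFor, usage) and the default limit of 100 are hypothetical, and it assumes that a rule's regexp must match the whole hostname and that the first matching rule wins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; not the HBase implementation.
final class CapacityRuleSketch {

    /** One rule line: a hostname regexp and a region limit. */
    static final class Rule {
        final Pattern hostPattern;
        final int limit;

        Rule(Pattern hostPattern, int limit) {
            this.hostPattern = hostPattern;
            this.limit = limit;
        }
    }

    /** Parses lines such as "rs[0-9] 200"; blank lines are skipped. */
    static List<Rule> parse(List<String> lines) {
        List<Rule> rules = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed rule: " + line);
            }
            rules.add(new Rule(Pattern.compile(parts[0]), Integer.parseInt(parts[1])));
        }
        return rules;
    }

    /** Assumption: the regexp must match the whole hostname; first match wins. */
    static int limitFor(String hostname, List<Rule> rules, int defaultLimit) {
        for (Rule rule : rules) {
            if (rule.hostPattern.matcher(hostname).matches()) {
                return rule.limit;
            }
        }
        return defaultLimit;
    }

    /** Fraction of a server's limit in use; a value above 1.0 means overloaded. */
    static double usage(int regions, int limit) {
        return (double) regions / limit;
    }

    public static void main(String[] args) {
        List<Rule> rules = parse(Arrays.asList("rs[0-9] 200", "rs1[0-9] 50"));
        int defaultLimit = 100; // hypothetical default, unused by rs0..rs19
        int clusterCapacity = 0;
        for (int i = 0; i < 20; i++) {
            String host = "rs" + i;
            int limit = limitFor(host, rules, defaultLimit);
            clusterCapacity += limit;
            System.out.println(host + " limit=" + limit + " usage=" + usage(60, limit));
        }
        System.out.println("cluster capacity=" + clusterCapacity);
    }
}
```

With these assumptions, rs0 to rs9 sit at 30% of their limit and rs10 to rs19 at 120%, which is the imbalance a capacity-aware cost function would penalize. If the regexp were allowed to match only part of the hostname, rs[0-9] would also match rs10, so rule order or anchoring would matter.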



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
