PierreZ opened a new pull request #677: HBASE-23073 Add an optional 
costFunction to balance regions according to a capacity rule
URL: https://github.com/apache/hbase/pull/677
 
 
   
   
   Based on the work in HBASE-22618, users can now load custom cost functions 
inside the main balancer used by HBase. As an example, we would like to add 
upstream an optional cost function called HeterogeneousRegionCountCostFunction 
that addresses our issue: how to balance regions according to the capacity 
of each RegionServer (RS), instead of using RegionCountSkewCostFunction, which 
only tries to avoid skew.
   
   A rule file is loaded from HDFS before balancing. It contains one rule per 
line. Each rule consists of a hostname regexp and a region limit. For example, 
we could have:
   
       rs[0-9] 200
   
       rs1[0-9] 50
   
   RegionServers whose hostname matches the first rule will have a limit of 
200 regions, and those matching the second rule a limit of 50. If no rule 
matches, a default limit is used.
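
   To make the rule format concrete, here is a minimal sketch of how such lines 
could be parsed and matched. It is not the code from this PR: the class name 
`CapacityRules`, the `DEFAULT_LIMIT` value, and the choice of first-match-wins 
with whole-hostname matching are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: names and behavior are assumptions, not the PR's implementation.
class CapacityRules {
    // Hypothetical fallback; the description only says "a default is set".
    static final int DEFAULT_LIMIT = 100;

    private final List<Pattern> patterns = new ArrayList<>();
    private final List<Integer> limits = new ArrayList<>();

    // Parses lines of the form "<hostname regexp> <limit>", skipping blank lines.
    static CapacityRules parse(List<String> lines) {
        CapacityRules rules = new CapacityRules();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed rule: " + line);
            }
            rules.patterns.add(Pattern.compile(parts[0]));
            rules.limits.add(Integer.parseInt(parts[1]));
        }
        return rules;
    }

    // Returns the limit of the first rule whose regexp matches the whole hostname.
    int limitFor(String hostname) {
        for (int i = 0; i < patterns.size(); i++) {
            if (patterns.get(i).matcher(hostname).matches()) {
                return limits.get(i);
            }
        }
        return DEFAULT_LIMIT;
    }

    public static void main(String[] args) {
        CapacityRules rules = parse(Arrays.asList("rs[0-9] 200", "", "rs1[0-9] 50"));
        System.out.println(rules.limitFor("rs3"));        // 200
        System.out.println(rules.limitFor("rs12"));       // 50
        System.out.println(rules.limitFor("other-host")); // 100 (assumed default)
    }
}
```

   Whole-hostname matching matters in this sketch: with a partial match, 
`rs[0-9]` would also match `rs12`, and the second rule would never apply.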
   
   Thanks to the rules, we have two pieces of information: the maximum number 
of regions for the cluster, and the limit for each server. HeterogeneousBalancer 
will try to balance regions according to each server's capacity.
   
   Let's take an example with 20 RS:
   
   10 RS, named rs0 to rs9, loaded with 60 regions each; each can handle 
200 regions.
   10 RS, named rs10 to rs19, loaded with 60 regions each; each can handle 
50 regions.
   
   Based on the following rules:
   
       rs[0-9] 200
   
       rs1[0-9] 50
   
   The second group is overloaded (60 regions against a limit of 50), whereas 
the first group has plenty of space (60 against 200). Moving a region from the 
second group to the first should provide a lower cost.
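
   The arithmetic of this example can be checked with a small sketch. The cost 
used here, the total number of regions above each server's limit, is a 
hypothetical stand-in chosen for illustration; it is not the formula 
implemented by HeterogeneousRegionCountCostFunction. The class and method 
names are likewise made up.

```java
import java.util.Arrays;

// Illustrative only: a stand-in "overload" cost, not the PR's cost function.
class OverloadExample {
    // Sums, over all servers, the number of regions above that server's limit.
    static int overload(int[] regions, int[] limits) {
        int total = 0;
        for (int i = 0; i < regions.length; i++) {
            total += Math.max(0, regions[i] - limits[i]);
        }
        return total;
    }

    // 20 servers (index i stands for host "rs" + i), 60 regions each.
    static int[] initialRegions() {
        int[] regions = new int[20];
        Arrays.fill(regions, 60);
        return regions;
    }

    // rs0..rs9 can hold 200 regions, rs10..rs19 can hold 50.
    static int[] limits() {
        int[] limits = new int[20];
        for (int i = 0; i < limits.length; i++) {
            limits[i] = i < 10 ? 200 : 50;
        }
        return limits;
    }

    // Cost after moving a single region from server `from` to server `to`.
    static int costAfterMove(int from, int to) {
        int[] regions = initialRegions();
        regions[from]--;
        regions[to]++;
        return overload(regions, limits());
    }

    public static void main(String[] args) {
        System.out.println(overload(initialRegions(), limits())); // 100
        System.out.println(costAfterMove(10, 0));                 // 99
        System.out.println(costAfterMove(0, 10));                 // 101
    }
}
```

   Under this stand-in cost, the initial overload is 100 (ten servers, each 10 
regions over their limit). Moving one region from rs10 to rs0 lowers it to 99, 
while moving one from rs0 to rs10 raises it to 101.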
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
