Hi! My name is Pierre, I'm working at OVH, an European cloud-provider. Our team, Observability, is heavily relying on HBase to store telemetry. We would like to open the discussion about adding into 1.4X and 2.X a new Balancer. <https://gist.github.com/PierreZ/15560e12c147e661e5c1b5f0edeb9282#our-situation>Our situation
The Observability team in OVH is responsible to handle logs and metrics from all servers/applications/equipments within OVH. HBase is used as the datastore for metrics. We are using an open-source software called Warp10 <https://warp10.io> to handle all the metrics coming from OVH's infrastructure. We are operating three HBase 1.4 clusters, including one with 218 RegionServers which is growing every month. We found out that *in our usecase*(single table, dedicated HBase and Hadoop tuned for our usecase, good key distribution)*, the number of regions per RS was the real limit for us*. Over the years, due to historical reasons and also the need to benchmark new machines, we ended-up with differents groups of hardware: some servers can handle only 180 regions, whereas the biggest can handle more than 900. Because of such a difference, we had to disable the LoadBalancing to avoid the roundRobinAssigmnent. We developed some internal tooling which are responsible for load balancing regions across RegionServers. That was 1.5 year ago. Today, we are thinking about fully integrate it within HBase, using the LoadBalancer interface. We started working on a new Balancer called HeterogeneousBalancer, that will be able to fullfill our need. <https://gist.github.com/PierreZ/15560e12c147e661e5c1b5f0edeb9282#how-does-it-works>How does it works? A rule file is loaded before balancing. It contains lines of rules. A rule is composed of a regexp for hostname, and a limit. For example, we could have: rs[0-9] 200 rs1[0-9] 50 RegionServers with hostname matching the first rules will have a limit of 200, and the others 50. If there's no match, a default is set. Thanks to the rule, we have two informations: the max number of regions for this cluster, and the rules for each servers. HeterogeneousBalancer will try to balance regions according to their capacity. Let's take an example. Let's say that we have 20 RS: - 10 RS, named through rs0 to rs9 loaded with 60 regions each, and each can handle 200 regions. - 10 RS, named through rs10 to rs19 loaded with 60 regions each, and each can support 50 regions. Based on the following rules: rs[0-9] 200 rs1[0-9] 50 The second group is overloaded, whereas the first group has plenty of space. We know that we can handle at maximum *2500 regions* (200*10 + 50*10) and we have currently *1200 regions* (60*20). HeterogeneousBalancer will understand that the cluster is *full at 48.0%* (1200/2500). Based on this information, we will then *try to put all the RegionServers to ~48% of load according to the rules.* In this case, it will move regions from the second group to the first. The balancer will: - compute how many regions needs to be moved. In our example, by moving 36 regions on rs10, we could go from 120.0% to 46.0% - select regions with lowest data-locality - try to find an appropriate RS for the region. We will take the lowest available RS. <https://gist.github.com/PierreZ/15560e12c147e661e5c1b5f0edeb9282#current-status>Current status We started the implementation, but it is not finished yet. we are planning to deploy it on a cluster with lower impact for testing, and then put it on our biggest cluster. We have some basic implementation of all methods, but we need to add more tests and make the code more robust. You can find the proof-of-concept here <https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/HeterogeneousBalancer.java>, and some early tests here <https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/HeterogeneousBalancer.java>, here <https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestHeterogeneousBalancerBalance.java>, and here <https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestHeterogeneousBalancerRules.java>. We wrote the balancer for our use-case, which means that: - there is one table - there is no region-replica - good key dispersion - there is no regions on master However, we believe that this will not be too complicated to implement. We are also thinking about the possibility to limit overassigments of regions by moving them to the least loaded RS. Even if the balancing strategy seems simple, we do think that having the possibility to run HBase cluster on heterogeneous hardware is vital, especially in cloud environment, because you may not be able to buy the same server specs throughout the years. What do you think about our approach? Are you interested for such a contribution? --- Pierre ZEMB - OVH Group Observability/Metrics - Infrastructure Engineer pierrezemb.fr +33 7 86 95 61 65
