I have the same question as Clay. We can see.

Also, in general we cannot assume there is only one table in the cluster. At
the top level we offer options to balance either per table or per cluster.
The new balancer should take this into account as well, IMO. And yes, if it
can work with a cost function change alone, it will be a much smaller
change. At a high level I am +1 for such a simple way to handle clusters
with heterogeneous nodes.

Anoop

On Fri, Jun 21, 2019 at 5:15 AM Clay Baenziger (BLOOMBERG/ 731 LEX) <
[email protected]> wrote:

> Could it work to have the stochastic load balancer use pluggable cost
> functions[1]? Then, could this type of a load balancer be implemented
> simply as a new cost function which folks could choose to load and mix with
> the others?
>
> -Clay
>
> [1]: Instead of this static list of cost functions?
> https://github.com/apache/hbase/blob/baf3ae80f5588ee848176adefc9f56818458a387/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L198
>
> From: [email protected] At: 06/20/19 12:54:23 To: [email protected]
> Subject: Re: Adding a new balancer to HBase
>
> Bonjour Pierre,
>
> Some time ago I built (for my own purposes) something similar that I called
> "LoadBasedLoadBalancer", which moves regions based on my servers' load and
> capacity. The load balancer queries the region servers to get the
> number of cores, the allocated heap, the 5-minute load average, etc., and
> balances the regions based on that.
>
> I already felt that need years ago. What you are proposing is a simplified
> version that will most probably be more stable and easier to implement. I
> will be happy to assist you in the process of getting that into HBase.
>
> Have you already opened the JIRA to support that?
>
> Thanks,
>
> JMS
>
> On Thu, Jun 20, 2019 at 01:11, ramkrishna vasudevan <
> [email protected]> wrote:
>
> > Seems like a very good idea for cloud servers. Please feel free to raise a
> > JIRA and contribute your patch.
> >
> > Regards
> > Ram
> >
> > On Tue, Jun 18, 2019 at 8:09 AM 刘新星 <[email protected]> wrote:
> >
> > >
> > > I'm interested in this. It sounds like a weighted load balancer and would
> > > be valuable for users who deploy their HBase clusters on cloud servers.
> > > You can create a JIRA and make a patch for better discussion.
> > >
> > > At 2019-06-18 05:00:54, "Pierre Zemb" <[email protected]> wrote:
> > > >Hi!
> > > >
> > > >My name is Pierre, I'm working at OVH, a European cloud provider. Our
> > > >team, Observability, relies heavily on HBase to store telemetry. We
> > > >would like to open the discussion about adding a new Balancer to 1.4X
> > > >and 2.X.
> > > >
> > > >Our situation
> > > ><https://gist.github.com/PierreZ/15560e12c147e661e5c1b5f0edeb9282#our-situation>
> > > >
> > > >The Observability team at OVH is responsible for handling logs and metrics
> > > >from all servers, applications, and equipment within OVH. HBase is used as
> > > >the datastore for metrics. We are using open-source software called Warp10
> > > ><https://warp10.io> to handle all the metrics coming from OVH's
> > > >infrastructure. We are operating three HBase 1.4 clusters, including one
> > > >with 218 RegionServers, which is growing every month.
> > > >
> > > >We found out that *in our use case* (single table, dedicated HBase and
> > > >Hadoop tuned for our use case, good key distribution), *the number of
> > > >regions per RS was the real limit for us*.
> > > >
> > > >Over the years, due to historical reasons and also the need to benchmark
> > > >new machines, we ended up with different groups of hardware: some servers
> > > >can handle only 180 regions, whereas the biggest can handle more than 900.
> > > >Because of such a difference, we had to disable the LoadBalancing to avoid
> > > >the roundRobinAssignment. We developed some internal tooling that is
> > > >responsible for load balancing regions across RegionServers. That was
> > > >1.5 years ago.
> > > >
> > > >Today, we are thinking about fully integrating it within HBase, using the
> > > >LoadBalancer interface. We started working on a new Balancer called
> > > >HeterogeneousBalancer, which will be able to fulfill our needs.
> > > >
> > > >How does it work?
> > > ><https://gist.github.com/PierreZ/15560e12c147e661e5c1b5f0edeb9282#how-does-it-works>
> > > >
> > > >A rule file is loaded before balancing. It contains lines of rules. A rule
> > > >is composed of a regexp for the hostname and a limit. For example, we
> > > >could have:
> > > >
> > > >rs[0-9] 200
> > > >rs1[0-9] 50
> > > >
> > > >RegionServers whose hostname matches the first rule will have a limit of
> > > >200, and those matching the second rule will have a limit of 50. If
> > > >there's no match, a default is set.
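
A minimal sketch of how such a rule file might be parsed and applied. The class and method names, the whitespace-separated line format, and the default of 100 are assumptions made for illustration; they are not taken from the proof-of-concept. The sketch also assumes the regexp must match the whole hostname, which is what keeps `rs[0-9]` from also matching `rs12`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class RuleFileSketch {
    /** One rule: a hostname regexp and the region limit it assigns. */
    static final class Rule {
        final Pattern hostPattern;
        final int limit;

        Rule(Pattern hostPattern, int limit) {
            this.hostPattern = hostPattern;
            this.limit = limit;
        }
    }

    /** Parses lines of the form "<regexp> <limit>", skipping blank lines. */
    static List<Rule> parse(List<String> lines) {
        List<Rule> rules = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed rule: " + line);
            }
            rules.add(new Rule(Pattern.compile(parts[0]), Integer.parseInt(parts[1])));
        }
        return rules;
    }

    /** Returns the limit of the first rule matching the whole hostname, else the default. */
    static int limitFor(String hostname, List<Rule> rules, int defaultLimit) {
        for (Rule rule : rules) {
            if (rule.hostPattern.matcher(hostname).matches()) {
                return rule.limit;
            }
        }
        return defaultLimit;
    }

    public static void main(String[] args) {
        List<Rule> rules = parse(Arrays.asList("rs[0-9] 200", "rs1[0-9] 50"));
        System.out.println(limitFor("rs3", rules, 100));   // 200
        System.out.println(limitFor("rs12", rules, 100));  // 50
        System.out.println(limitFor("db1", rules, 100));   // 100 (hypothetical default)
    }
}
```

With substring matching (`find()`) instead of `matches()`, the first rule would also capture rs10 to rs19, so rule order and match semantics would need to be pinned down in a real implementation.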
> > > >
> > > >Thanks to the rules, we have two pieces of information: the maximum number
> > > >of regions for this cluster, and the limit for each server.
> > > >HeterogeneousBalancer will try to balance regions according to each
> > > >server's capacity.
> > > >
> > > >Let's take an example. Let's say that we have 20 RS:
> > > >
> > > >   - 10 RS, named rs0 through rs9, loaded with 60 regions each; each
> > > >   can handle 200 regions.
> > > >   - 10 RS, named rs10 through rs19, loaded with 60 regions each; each
> > > >   can handle 50 regions.
> > > >
> > > >Based on the following rules:
> > > >
> > > >rs[0-9] 200
> > > >rs1[0-9] 50
> > > >
> > > >The second group is overloaded, whereas the first group has plenty of
> > > >space.
> > > >
> > > >We know that we can handle at most *2500 regions* (200*10 + 50*10), and
> > > >we currently have *1200 regions* (60*20). HeterogeneousBalancer will
> > > >understand that the cluster is *full at 48.0%* (1200/2500). Based on this
> > > >information, we will then *try to bring all the RegionServers to ~48% of
> > > >load according to the rules.* In this case, it will move regions from the
> > > >second group to the first.
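
The arithmetic above can be sketched as follows. This is only an illustration under stated assumptions: the class name, the map-based inputs, and the rounding of per-server targets are hypothetical and not taken from the proof-of-concept.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ClusterFillSketch {
    /** Cluster fill ratio: total hosted regions divided by total capacity. */
    static double clusterFill(Map<String, Integer> regionCounts, Map<String, Integer> limits) {
        long hosted = 0;
        long capacity = 0;
        for (Map.Entry<String, Integer> e : regionCounts.entrySet()) {
            hosted += e.getValue();
            capacity += limits.get(e.getKey());
        }
        return (double) hosted / capacity;
    }

    /** Target region count for one server: its limit scaled by the cluster fill ratio. */
    static int targetFor(int limit, double fill) {
        return (int) Math.round(limit * fill);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Map<String, Integer> limits = new LinkedHashMap<>();
        // 20 servers with 60 regions each; rs0..rs9 allow 200, rs10..rs19 allow 50.
        for (int i = 0; i < 20; i++) {
            counts.put("rs" + i, 60);
            limits.put("rs" + i, i < 10 ? 200 : 50);
        }
        double fill = clusterFill(counts, limits);  // 1200 / 2500
        System.out.println(targetFor(200, fill));   // 96
        System.out.println(targetFor(50, fill));    // 24
    }
}
```

In this example the targets add up to the current total (10 * 96 + 10 * 24 = 1200), so no region is left unplaced; with limits that do not divide as evenly, rounding would need extra care.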
> > > >
> > > >The balancer will:
> > > >
> > > >   - compute how many regions need to be moved. In our example, by moving
> > > >   36 regions off rs10, we could go from 120.0% to 48.0%
> > > >   - select the regions with the lowest data locality
> > > >   - try to find an appropriate RS for each region. We will take the
> > > >   least loaded available RS.
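
The three steps listed above could be sketched roughly as below. Everything here is an assumption for illustration: the `Region` type, the method names, and the reading of the destination choice as "the server with the lowest load relative to its limit" are hypothetical and not taken from the proof-of-concept or from the HBase API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MovePlanSketch {
    /** Hypothetical region descriptor: a name and its data locality in [0, 1]. */
    static final class Region {
        final String name;
        final double locality;

        Region(String name, double locality) {
            this.name = name;
            this.locality = locality;
        }
    }

    /** Picks the regions above the server's target, lowest data locality first. */
    static List<Region> pickRegionsToMove(List<Region> hosted, int limit, double clusterFill) {
        int target = (int) Math.round(limit * clusterFill);
        int excess = Math.max(0, hosted.size() - target);
        List<Region> sorted = new ArrayList<>(hosted);
        sorted.sort(Comparator.comparingDouble((Region r) -> r.locality));
        return new ArrayList<>(sorted.subList(0, excess));
    }

    /** Returns the server with the lowest region count relative to its limit. */
    static String leastLoaded(Map<String, Integer> counts, Map<String, Integer> limits) {
        String best = null;
        double bestRatio = Double.MAX_VALUE;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            double ratio = (double) e.getValue() / limits.get(e.getKey());
            if (ratio < bestRatio) {
                bestRatio = ratio;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Region> hosted = new ArrayList<>();
        for (int i = 0; i < 60; i++) {
            hosted.add(new Region("region-" + i, i / 60.0));
        }
        // rs10: 60 regions, limit 50, cluster fill 0.48 -> target 24, so 36 regions move.
        System.out.println(pickRegionsToMove(hosted, 50, 0.48).size()); // 36

        Map<String, Integer> counts = new LinkedHashMap<>();
        Map<String, Integer> limits = new LinkedHashMap<>();
        counts.put("rs0", 60);
        limits.put("rs0", 200);
        counts.put("rs10", 60);
        limits.put("rs10", 50);
        System.out.println(leastLoaded(counts, limits)); // rs0
    }
}
```

A real balancer would update the destination's count after each planned move before picking the next destination; the sketch leaves that loop out to keep the two decisions visible.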
> > > >
> > > >Current status
> > > ><https://gist.github.com/PierreZ/15560e12c147e661e5c1b5f0edeb9282#current-status>
> > > >
> > > >We started the implementation, but it is not finished yet. We are planning
> > > >to deploy it on a cluster with lower impact for testing, and then put it
> > > >on our biggest cluster.
> > > >
> > > >We have some basic implementation of all methods, but we need to add more
> > > >tests and make the code more robust. You can find the proof-of-concept here
> > > ><https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/HeterogeneousBalancer.java>,
> > > >and some early tests here
> > > ><https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/HeterogeneousBalancer.java>,
> > > >here
> > > ><https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestHeterogeneousBalancerBalance.java>,
> > > >and here
> > > ><https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestHeterogeneousBalancerRules.java>.
> > > >
> > > >We wrote the balancer for our use case, which means that:
> > > >
> > > >   - there is one table
> > > >   - there are no region replicas
> > > >   - there is good key dispersion
> > > >   - there are no regions on master
> > > >
> > > >However, we believe that supporting the other cases will not be too
> > > >complicated to implement. We are also thinking about the possibility of
> > > >limiting over-assignment of regions by moving them to the least loaded RS.
> > > >
> > > >Even if the balancing strategy seems simple, we do think that having the
> > > >possibility to run an HBase cluster on heterogeneous hardware is vital,
> > > >especially in cloud environments, because you may not be able to buy the
> > > >same server specs throughout the years.
> > > >
> > > >What do you think about our approach? Would you be interested in such a
> > > >contribution?
> > > >---
> > > >
> > > >Pierre ZEMB - OVH Group
> > > >Observability/Metrics - Infrastructure Engineer
> > > >pierrezemb.fr
> > > >+33 7 86 95 61 65
> > >
> >
>
>
>
