[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Charlie Qiangeng Xu updated HBASE-17110:
----------------------------------------
    Attachment: SimpleBalancerBytableOverall.V1

> Add an "Overall Strategy" option(balanced both on table level and server
> level) to SimpleLoadBalancer
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-17110
>                 URL: https://issues.apache.org/jira/browse/HBASE-17110
>             Project: HBase
>          Issue Type: New Feature
>          Components: Balancer
>    Affects Versions: 2.0.0, 1.2.4
>            Reporter: Charlie Qiangeng Xu
>            Assignee: Charlie Qiangeng Xu
>         Attachments: SimpleBalancerBytableOverall.V1
>
>
> This issue proposes an enhancement to SimpleLoadBalancer. It introduces a
> new strategy, "bytableOverall", which is enabled by adding:
> <property>
>   <name>hbase.master.loadbalance.bytableOverall</name>
>   <value>true</value>
> </property>
> We have been using the strategy on our largest cluster for several months.
> It has proven helpful and stable, and the result is clearly visible to
> users.
> Here is why it helps:
> When operating large-scale clusters (our case), some companies still prefer
> SimpleLoadBalancer for its simplicity, quick balance plan generation,
> etc. The current SimpleLoadBalancer has two modes:
> 1. byTable, which only guarantees that the regions of each table are
> uniformly distributed.
> 2. byCluster, which ignores the distribution within tables and balances all
> regions together.
> If the pressure differs from table to table, byTable is the preferable
> option in most cases. Yet this choice sacrifices cluster-level balance and
> can leave some servers with significantly higher load, e.g.
> 242 regions on server A but 417 regions on server B (real-world stats).
> Consider a cluster with 3 tables and 4 servers:
> server A has 3 regions: table1:1, table2:1, table3:1
> server B has 3 regions: table1:2, table2:2, table3:2
> server C has 3 regions: table1:3, table2:3, table3:3
> server D has 0 regions.
> From the byTable strategy's perspective, the cluster is already perfectly
> balanced at the table level. But an ideal distribution would look like:
> server A has 2 regions: table2:1, table3:1
> server B has 2 regions: table1:2, table3:2
> server C has 3 regions: table1:3, table2:3, table3:3
> server D has 2 regions: table1:1, table2:2
> This is what the new mode "byTableOverall" can achieve.
> Two unit tests have been added as well; the last one demonstrates the
> advantage of the new strategy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
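
The two four-server layouts in the issue description can be compared mechanically. The sketch below is only an illustration of the balance criteria being discussed, not the logic of the attached patch: the class BalanceCheckSketch and its methods are hypothetical names, and "spread" (largest minus smallest region count across servers) is an assumed way of measuring balance.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of HBase or the attached patch.
final class BalanceCheckSketch {

    // Largest minus smallest total region count per server.
    static int serverSpread(Map<String, List<String>> assignment) {
        if (assignment.isEmpty()) return 0;
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (List<String> regions : assignment.values()) {
            min = Math.min(min, regions.size());
            max = Math.max(max, regions.size());
        }
        return max - min;
    }

    // Same spread, counting only one table's regions ("table:index" naming as in the example).
    static int tableSpread(Map<String, List<String>> assignment, String table) {
        if (assignment.isEmpty()) return 0;
        String prefix = table + ":";
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (List<String> regions : assignment.values()) {
            int count = 0;
            for (String region : regions) {
                if (region.startsWith(prefix)) count++;
            }
            min = Math.min(min, count);
            max = Math.max(max, count);
        }
        return max - min;
    }

    // First layout from the description: balanced per table, server D empty.
    static Map<String, List<String>> byTableLayout() {
        Map<String, List<String>> m = new LinkedHashMap<>();
        m.put("A", Arrays.asList("table1:1", "table2:1", "table3:1"));
        m.put("B", Arrays.asList("table1:2", "table2:2", "table3:2"));
        m.put("C", Arrays.asList("table1:3", "table2:3", "table3:3"));
        m.put("D", Collections.<String>emptyList());
        return m;
    }

    // Second layout from the description: balanced per table and per server.
    static Map<String, List<String>> overallLayout() {
        Map<String, List<String>> m = new LinkedHashMap<>();
        m.put("A", Arrays.asList("table2:1", "table3:1"));
        m.put("B", Arrays.asList("table1:2", "table3:2"));
        m.put("C", Arrays.asList("table1:3", "table2:3", "table3:3"));
        m.put("D", Arrays.asList("table1:1", "table2:2"));
        return m;
    }

    static void report(String name, Map<String, List<String>> layout) {
        System.out.println(name + ": server spread = " + serverSpread(layout));
        for (String table : Arrays.asList("table1", "table2", "table3")) {
            System.out.println("  " + table + " spread = " + tableSpread(layout, table));
        }
    }

    public static void main(String[] args) {
        report("first layout", byTableLayout());
        report("second layout", overallLayout());
    }
}
```

In the first layout every table has a spread of 1 while the server spread is 3. In the second layout both the per-table spreads and the server spread are 1, which is the best possible for 9 regions on 4 servers.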