[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-17110:
--------------------------
    Fix Version/s: 1.4.0

> Improve SimpleLoadBalancer to always take server-level balance into account
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-17110
>                 URL: https://issues.apache.org/jira/browse/HBASE-17110
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer
>    Affects Versions: 2.0.0, 1.2.4
>            Reporter: Charlie Qiangeng Xu
>            Assignee: Charlie Qiangeng Xu
>             Fix For: 2.0.0, 1.4.0
>
>         Attachments: HBASE-17110.patch, HBASE-17110-V2.patch,
> HBASE-17110-V3.patch, HBASE-17110-V4.patch, HBASE-17110-V5.patch,
> HBASE-17110-V6.patch, HBASE-17110-V7.patch, HBASE-17110-V8.patch
>
>
> With the byTable strategy, server-level imbalance can still occur; this
> JIRA addresses that.
> Some more background:
> When operating large-scale clusters (our case), some companies still prefer
> to use {{SimpleLoadBalancer}} due to its simplicity, quick balance plan
> generation, etc. The current SimpleLoadBalancer has two modes:
> 1. byTable, which only guarantees that the regions of each table are
> uniformly distributed.
> 2. byCluster, which ignores the distribution within tables and balances the
> regions all together.
> If the pressures on different tables are different, the byTable option is
> the preferable one in most cases. Yet this choice sacrifices cluster-level
> balance and can cause some servers to carry significantly higher load,
> e.g. 242 regions on server A but 417 regions on server B (real-world stats).
> Consider this case, a cluster has 3 tables and 4 servers:
> {noformat}
> server A has 3 regions: table1:1, table2:1, table3:1
> server B has 3 regions: table1:2, table2:2, table3:2
> server C has 3 regions: table1:3, table2:3, table3:3
> server D has 0 regions.
> {noformat}
> From the byTable strategy's perspective, the cluster is already perfectly
> balanced at the table level.
> But an ideal state would look like this:
> {noformat}
> server A has 2 regions: table2:1, table3:1
> server B has 2 regions: table1:2, table3:2
> server C has 3 regions: table1:3, table2:3, table3:3
> server D has 2 regions: table1:1, table2:2
> {noformat}
> We can see the server loads change from 3,3,3,0 to 2,2,3,2, while table1,
> table2 and table3 each remain balanced. This is the goal this JIRA tries
> to achieve.
> Two UTs will be added as well, with the last one demonstrating the advantage
> of the new strategy. Also, an onConfigurationChange method will be
> implemented so that the "slop" variable can be adjusted at runtime.
> We have been using the strategy on our largest cluster for several months,
> so its effect has been validated to some extent.
>
>
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
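
The before/after example in the description can be checked with a small sketch. It is written in Python for brevity and is not taken from the attached patches; the helper names (`server_loads`, `table_loads`, `spread`) are hypothetical, and `spread` (max minus min) is only an illustrative imbalance measure, not the slop-based criterion SimpleLoadBalancer actually uses.

```python
# Region placements copied from the two {noformat} blocks in the description:
# server -> list of (table, region number) pairs.
BEFORE = {
    "A": [("table1", 1), ("table2", 1), ("table3", 1)],
    "B": [("table1", 2), ("table2", 2), ("table3", 2)],
    "C": [("table1", 3), ("table2", 3), ("table3", 3)],
    "D": [],
}
AFTER = {
    "A": [("table2", 1), ("table3", 1)],
    "B": [("table1", 2), ("table3", 2)],
    "C": [("table1", 3), ("table2", 3), ("table3", 3)],
    "D": [("table1", 1), ("table2", 2)],
}


def server_loads(assignment):
    """Total number of regions hosted by each server (the byCluster view)."""
    return {server: len(regions) for server, regions in assignment.items()}


def table_loads(assignment):
    """For each table, how many of its regions each server hosts (the byTable view)."""
    tables = sorted({t for regions in assignment.values() for t, _ in regions})
    return {
        table: {
            server: sum(1 for t, _ in regions if t == table)
            for server, regions in assignment.items()
        }
        for table in tables
    }


def spread(counts):
    """Hypothetical imbalance measure: largest count minus smallest count."""
    values = list(counts.values())
    return max(values) - min(values)


if __name__ == "__main__":
    for name, assignment in (("before", BEFORE), ("after", AFTER)):
        loads = server_loads(assignment)
        per_table = {t: spread(c) for t, c in table_loads(assignment).items()}
        print(name, "server loads:", loads, "spread:", spread(loads))
        print(name, "per-table spread:", per_table)
```

In both placements every table has a spread of 1, the best possible for 3 regions on 4 servers, so byTable sees no difference between them. Only the server-level view separates them: the spread drops from 3 to 1.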