[ 
https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15689060#comment-15689060
 ] 

Charlie Qiangeng Xu edited comment on HBASE-17110 at 11/23/16 6:10 AM:
-----------------------------------------------------------------------

Hi [~zghaobac], thanks for pointing these out.
{quote}
    If we decide this is a default strategy, this method seems doesn't need 2 
arguments?
{quote}
SimpleLoadBalancer indeed doesn't need both arguments, but unfortunately 
the method is shared by StochasticLoadBalancer and other balancers as well. 
Even if StochasticLoadBalancer no longer needs "byTable", we should still 
accommodate any customized balancers that some users may already have in 
place.

{quote}
   This config is not necessary if this is default strategy?
{quote}
This config applies to the strategy itself and would be helpful for a power 
user.
I deliberately added it because it gives finer control over the threshold for 
the cluster-level load difference, which is usually more tolerable than a 
difference at the table level. 
For most users, overallSlop simply defaults to the same value as slop.
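
To illustrate why a separate cluster-level threshold can be useful, here is a 
minimal sketch of a slop check. It assumes the common floor/ceiling rule 
(a load set is acceptable when every value lies between average * (1 - slop) 
and average * (1 + slop)); the class name, method name, and the slop values 
used are hypothetical and are not taken from the patch.

```java
public class SlopSketch {

    /**
     * Returns true when the largest or smallest load falls outside the
     * tolerated band around the average. The floor/ceiling rule is an
     * assumption made for this sketch, not the patch's actual code.
     */
    static boolean needsBalance(int[] loads, double slop) {
        if (loads.length == 0) {
            return false;
        }
        int total = 0;
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int load : loads) {
            total += load;
            min = Math.min(min, load);
            max = Math.max(max, load);
        }
        double average = (double) total / loads.length;
        int floor = (int) Math.floor(average * (1 - slop));
        int ceiling = (int) Math.ceil(average * (1 + slop));
        return max > ceiling || min < floor;
    }

    public static void main(String[] args) {
        int[] serverLoads = {3, 3, 3, 0};   // regions per server
        double slop = 0.2;                  // illustrative strict threshold
        double overallSlop = 1.0;           // illustrative looser cluster-level threshold
        System.out.println("strict threshold: " + needsBalance(serverLoads, slop));
        System.out.println("loose threshold:  " + needsBalance(serverLoads, overallSlop));
    }
}
```

With the same server loads, the stricter value flags the cluster as unbalanced 
while the looser one tolerates it, which is the kind of control a separate 
overallSlop is meant to give.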






> Add an "Overall Strategy" option(balanced both on table level and server 
> level) to SimpleLoadBalancer
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-17110
>                 URL: https://issues.apache.org/jira/browse/HBASE-17110
>             Project: HBase
>          Issue Type: New Feature
>          Components: Balancer
>    Affects Versions: 2.0.0, 1.2.4
>            Reporter: Charlie Qiangeng Xu
>            Assignee: Charlie Qiangeng Xu
>         Attachments: HBASE-17110-V2.patch, HBASE-17110-V3.patch, 
> HBASE-17110-V4.patch, HBASE-17110-V5.patch, HBASE-17110.patch
>
>
> This jira is an enhancement to SimpleLoadBalancer. It introduces a new 
> strategy, "bytableOverall", which can be enabled by adding:
> {noformat}
> <property>
>   <name>hbase.master.loadbalance.bytableOverall</name>
>   <value>true</value>
> </property>
> {noformat}
> We have been using this strategy on our largest cluster for several months. 
> It has proven to be very helpful and stable, and the result is quite visible 
> to the users.
> Here is why it helps:
> When operating large-scale clusters (our case), some companies still prefer 
> {{SimpleLoadBalancer}} for its simplicity, quick balance plan generation, 
> etc. The current SimpleLoadBalancer has two modes: 
> 1. byTable, which only guarantees that the regions of each table are 
> uniformly distributed. 
> 2. byCluster, which ignores the distribution within tables and balances all 
> regions together.
> If the pressure on different tables differs, byTable is the preferable 
> option in most cases. Yet this choice sacrifices cluster-level balance and 
> can leave some servers with significantly higher load, e.g. 242 regions on 
> server A but 417 regions on server B (real-world stats).
> Consider a cluster with 3 tables and 4 servers:
> {noformat}
>   server A has 3 regions: table1:1, table2:1, table3:1
>   server B has 3 regions: table1:2, table2:2, table3:2
>   server C has 3 regions: table1:3, table2:3, table3:3
>   server D has 0 regions.
> {noformat}
> From the byTable strategy's perspective, the cluster is already perfectly 
> balanced at the table level. But an ideal state would look like:
> {noformat}
>   server A has 2 regions: table2:1, table3:1
>   server B has 2 regions: table1:2, table3:2
>   server C has 3 regions: table1:3, table2:3, table3:3
>   server D has 2 regions: table1:1, table2:2
> {noformat}
> The server loads change from 3,3,3,0 to 2,2,3,2, while table1, table2 and 
> table3 all stay balanced.
> This is what the new mode "byTableOverall" achieves.
> Two UTs have been added as well; the last one demonstrates the advantage of 
> the new strategy.
> Also, an onConfigurationChange method has been implemented so that the 
> "slop" variable can be changed at runtime.
>  
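
The four-server example in the issue description can be checked mechanically. 
The sketch below only verifies the two placements given there; it does not 
implement the balancer. The class name, the array encoding of the cluster, 
and the "spread" metric (largest minus smallest count) are assumptions made 
for illustration.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

public class OverallBalanceSketch {

    // Each row is one server (A, B, C, D); each entry names the table that owns a hosted region.
    static final String[][] BEFORE = {
        {"table1", "table2", "table3"},
        {"table1", "table2", "table3"},
        {"table1", "table2", "table3"},
        {}
    };
    static final String[][] AFTER = {
        {"table2", "table3"},
        {"table1", "table3"},
        {"table1", "table2", "table3"},
        {"table1", "table2"}
    };

    /** Difference between the most and least loaded server, counting all regions. */
    static int serverSpread(String[][] cluster) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (String[] server : cluster) {
            min = Math.min(min, server.length);
            max = Math.max(max, server.length);
        }
        return max - min;
    }

    /** Largest per-table difference between the most and least loaded server. */
    static int maxTableSpread(String[][] cluster) {
        Map<String, int[]> perTable = new TreeMap<>();
        for (int s = 0; s < cluster.length; s++) {
            for (String table : cluster[s]) {
                perTable.computeIfAbsent(table, t -> new int[cluster.length])[s]++;
            }
        }
        int worst = 0;
        for (int[] counts : perTable.values()) {
            int min = Arrays.stream(counts).min().getAsInt();
            int max = Arrays.stream(counts).max().getAsInt();
            worst = Math.max(worst, max - min);
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println("before: server spread " + serverSpread(BEFORE)
            + ", max table spread " + maxTableSpread(BEFORE));
        System.out.println("after:  server spread " + serverSpread(AFTER)
            + ", max table spread " + maxTableSpread(AFTER));
    }
}
```

Both placements keep every table within one region of even across the 
servers, but only the second also narrows the gap between the most and least 
loaded server (from 3 regions to 1).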



