[ 
https://issues.apache.org/jira/browse/HBASE-25882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17343632#comment-17343632
 ] 

Clara Xiong commented on HBASE-25882:
-------------------------------------

Thank you for looking at this function. 

I am not sure this Jira is needed. First of all, each cost function should be 
implemented independently so that it can be tuned on its own. If this function 
is not needed, an operator can turn it off by setting its weight to 0.
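
As a rough illustration of turning the function off, an hbase-site.xml entry 
along these lines should work. The property name below is the weight key as I 
recall it from the StochasticLoadBalancer configuration, so treat it as an 
assumption and check it against the HBase version in use.

```xml
<!-- Assumed key for the TableSkewCostFunction weight; verify for your version. -->
<property>
  <name>hbase.master.balancer.stochastic.tableSkewCost</name>
  <!-- A weight of 0 removes this cost function from the weighted total. -->
  <value>0</value>
</property>
```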

Secondly, as I understand it, the example would cause a problem no matter how 
many tables we have. I have another Jira to improve this cost function, 
https://issues.apache.org/jira/browse/HBASE-25739, which completely rewrites 
the implementation and fixes the problem.

> TableSkewCostFunction may cost unnecessary calculation steps when balancing 
> by table
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-25882
>                 URL: https://issues.apache.org/jira/browse/HBASE-25882
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer
>    Affects Versions: 3.0.0-alpha-1, 2.0.0
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Major
>
> When balancing by table, the StochasticLoadBalancer creates the cluster 
> state from the region distribution of only one table. As a result, the 
> TableSkewCostFunction should be replaced by the 
> RegionCountSkewCostFunction when the table count of the cluster state is less 
> than 2.
> The most important problem is that TableSkewCostFunction causes 
> unnecessary calculation steps when there is only one table, and the cost it 
> computes may be incorrect.
> For example, suppose there are 5 online regionservers and only one table 
> with exactly one region, so the cluster state is [0,0,0,0,1]. Then the cost 
> of TableSkewCostFunction will be 1 (the expected value is 0), because max=1, 
> min=0.25, value=1. The computedMaxSteps will then be larger than 0, and some 
> balance actions will be generated to decrease the cost. But all of these 
> actions are meaningless for the skew count.
>  
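
The arithmetic in the quoted example can be reproduced with a small 
standalone sketch. It is not the HBase implementation: the class and method 
names are hypothetical, and it assumes the cost is a min-max scaling of the 
largest per-server region count, with max taken as the table's total region 
count and min as that total divided by the number of servers. Under that 
assumption min comes out as 0.2 for five servers rather than the quoted 0.25, 
but the scaled cost is 1 either way.

```java
// Hypothetical standalone sketch; not the HBase TableSkewCostFunction itself.
class TableSkewCostSketch {

    // Min-max scaling of value into [0, 1]; returns 0 when value is at or
    // below min, or when the range is empty.
    static double scale(double min, double max, double value) {
        if (max <= min || value <= min) {
            return 0;
        }
        return Math.max(0d, Math.min(1d, (value - min) / (max - min)));
    }

    // regionsPerServer holds one table's region count on each server.
    // The array is assumed to be non-empty.
    static double tableSkewCost(int[] regionsPerServer) {
        int numRegions = 0;
        int maxOnOneServer = 0;
        for (int count : regionsPerServer) {
            numRegions += count;
            maxOnOneServer = Math.max(maxOnOneServer, count);
        }
        // Assumed bounds: worst case puts every region on one server,
        // best case spreads them perfectly evenly.
        double max = numRegions;
        double min = (double) numRegions / regionsPerServer.length;
        return scale(min, max, maxOnOneServer);
    }

    public static void main(String[] args) {
        // One region on five servers: no move can improve this layout,
        // yet the scaled cost is at its maximum.
        System.out.println(tableSkewCost(new int[] {0, 0, 0, 0, 1}));
        // Five regions spread evenly: the scaled cost is zero.
        System.out.println(tableSkewCost(new int[] {1, 1, 1, 1, 1}));
    }
}
```

With a single region, the largest per-server count always equals the total, 
so the scaled value sits at the top of its range wherever the region is 
placed. That is consistent with the point that moving the region cannot lower 
the cost.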



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
