sunhelly commented on pull request #3260:
URL: https://github.com/apache/hbase/pull/3260#issuecomment-840337094


   > I am not sure this Jira is needed. First, each cost function should 
be implemented independently so that it can be tuned on its own. If this 
function is not needed, an operator can turn it off by setting its weight to 0.
   > Second, as I understand it, the example would cause a problem no matter 
how many tables we have. I have another Jira to improve this cost function, 
https://issues.apache.org/jira/browse/HBASE-25739, which completely rewrites 
the implementation and fixes the problem.
   
   Hi @clarax, thanks for your attention. 
   I think the TableSkewCostFunction can only make each table's regions 
roughly balanced, not fully balanced. For example, suppose there are two 
tables on the cluster and the whole-cluster distribution is [10,10,10,10]. 
If one table's distribution is [10,0,10,0] and the other's is [0,10,0,10], 
the cost of TableSkewCostFunction is 10/30. But if one table's distribution 
is [2,8,2,8] and the other's is [8,2,8,2], the cost of TableSkewCostFunction 
is 6/30, which is smaller than the previous value and makes it less likely 
that the balancer generates actions. Even if actions are generated 
afterwards, how does the balancer know which table is the most unbalanced? 
   Balancing by table can ensure that every table's regions are distributed 
evenly. I hope I have understood your ideas correctly; I'm a little unclear 
on a few points. Can you tell me whether you have used balance by table in 
your clusters, and how many tables have a region count smaller than the 
online RS count? Thanks.
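
   To make the arithmetic above concrete, here is a minimal sketch. It 
assumes the cost is the sum over tables of the largest per-server region 
count, scaled linearly between numRegions/numServers and numRegions. That 
formula is an assumption chosen because it reproduces the 10/30 and 6/30 
figures; the class and method names are hypothetical and this is not the 
actual HBase code.

```java
public class TableSkewSketch {
    // Clamp-and-normalize helper: maps value from [min, max] onto [0, 1].
    static double scale(double min, double max, double value) {
        if (max <= min || value <= min) {
            return 0.0;
        }
        return Math.max(0.0, Math.min(1.0, (value - min) / (max - min)));
    }

    // regions[t][s] = number of regions of table t hosted on server s.
    // Assumed formula: sum of each table's per-server maximum, scaled
    // between the ideal (numRegions / numServers) and the worst (numRegions).
    static double cost(int[][] regions) {
        int numServers = regions[0].length;
        int numRegions = 0;
        int sumOfPerTableMax = 0;
        for (int[] table : regions) {
            int tableMax = 0;
            for (int count : table) {
                numRegions += count;
                tableMax = Math.max(tableMax, count);
            }
            sumOfPerTableMax += tableMax;
        }
        double min = (double) numRegions / numServers;
        double max = numRegions;
        return scale(min, max, sumOfPerTableMax);
    }

    public static void main(String[] args) {
        // First case from the comment: (20 - 10) / (40 - 10) = 10/30.
        System.out.println(cost(new int[][] {{10, 0, 10, 0}, {0, 10, 0, 10}}));
        // Second case from the comment: (16 - 10) / (40 - 10) = 6/30.
        System.out.println(cost(new int[][] {{2, 8, 2, 8}, {8, 2, 8, 2}}));
    }
}
```

   Under this assumed formula, both layouts leave every server with 10 
regions in total, and the cost only drops to 0 when each table is itself 
spread evenly, for example [5,5,5,5] and [5,5,5,5].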
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

