sunhelly commented on pull request #3260:
URL: https://github.com/apache/hbase/pull/3260#issuecomment-840380126


   > > > @sunhelly We used it on a production cluster with about 7 user tables 
and on a test cluster with one user table. Both have a meta table with fewer 
regions than the cluster has nodes. Since I use the same aggregation as 
RegionCountSkewCostFunction, for a table with fewer regions than nodes, say 
[1,0,0,0], the cost is already at its minimum, so no balancing would be triggered.
   > > > I simply sum the scaled cost of each table, so the balancer will keep 
moving regions until all tables are balanced enough.
   > > 
   > > 
   > > Yes, the TableSkewCostFunction should use an aggregation similar to 
RegionCountSkewCostFunction, but computed from the perspective of each table. 
However, I think the calculation may be inefficient and may not have much 
impact on subsequent balancer runs. For example, suppose there are 100 tables 
on a cluster, 98 of which have a balanced layout while only 2 are imbalanced. 
The TableSkewCostFunction can detect this problem, but the balancer cannot 
specifically pick regions of these two tables when generating actions, so only 
about 2% of the chosen regions belong to these 2 tables. After the max steps, 
if these 2 tables are still imbalanced, the balancer has to keep generating 
actions in the next cycle, as you mentioned before.
   > > This is a problem of gradual balancing: it is hard to know when these two 
tables will be balanced, and what a proper value of minCostNeedBalance should 
be. I think setting "hbase.master.loadbalance.bytable" to true could solve 
this problem simply.
   > 
   > That is a valid concern. I am working on choosing a proper 
minCostNeedBalance. Please see the umbrella Jira 
https://issues.apache.org/jira/browse/HBASE-25697; your input is welcome. 
The byTable option doesn't work, though. Can we move the discussion to the 
other Jiras, since it is beyond the scope of this PR?
   
   Yes, of course. I have started watching the issue you referred to.
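
   To make the aggregation discussed above concrete, here is a minimal sketch of a scaled per-table skew cost summed over tables. It is a simplified illustration in the spirit of the RegionCountSkewCostFunction aggregation, not the code in this PR; the class `TableSkewSketch` and the methods `scaledSkew` and `sumOverTables` are hypothetical names. It shows why a table laid out as [1,0,0,0] contributes no cost, and why balanced tables add nothing to the sum.

```java
import java.util.Arrays;

// Hypothetical sketch; names and structure are assumptions, not HBase API.
final class TableSkewSketch {

  // Scaled skew in [0, 1] for one table, given its region count on each server.
  static double scaledSkew(double[] regionsPerServer) {
    int count = regionsPerServer.length;
    if (count == 0) {
      return 0.0;
    }
    double total = Arrays.stream(regionsPerServer).sum();
    double mean = total / count;

    // Worst case: one server holds every region, all others hold none.
    double max = (count - 1) * mean + (total - mean);

    // Best achievable case with whole regions.
    double min;
    if (count > total) {
      // Fewer regions than servers: some servers must stay empty.
      min = (count - total) * mean + (1 - mean) * total;
    } else {
      // Some servers hold ceil(mean) regions, the rest hold floor(mean).
      int numHigh = (int) (total - Math.floor(mean) * count);
      int numLow = count - numHigh;
      min = numHigh * (Math.ceil(mean) - mean) + numLow * (mean - Math.floor(mean));
    }
    min = Math.max(0.0, min);

    // Raw cost: total absolute deviation from the mean.
    double cost = 0.0;
    for (double n : regionsPerServer) {
      cost += Math.abs(mean - n);
    }

    // Scale into [0, 1]; a layout already at the minimum costs nothing.
    if (max <= min || cost <= min) {
      return 0.0;
    }
    return Math.max(0.0, Math.min(1.0, (cost - min) / (max - min)));
  }

  // Sum of the scaled skew of every table (one row per table).
  static double sumOverTables(double[][] regionsPerServerByTable) {
    double sum = 0.0;
    for (double[] table : regionsPerServerByTable) {
      sum += scaledSkew(table);
    }
    return sum;
  }
}
```

   With this sketch, `scaledSkew(new double[] {1, 0, 0, 0})` is 0 because the layout cannot be improved, while `scaledSkew(new double[] {4, 0, 0, 0})` is 1. When most tables are balanced, only the few imbalanced ones contribute to `sumOverTables`, which is why picking a threshold such as minCostNeedBalance is delicate.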
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

