clarax commented on pull request #3260:
URL: https://github.com/apache/hbase/pull/3260#issuecomment-840371592


   > > @sunhelly We used it on a production cluster with about 7 user tables and 
a test cluster with one user table. Both have a meta table with fewer 
regions than nodes. Since I use the same aggregation as 
RegionCountSkewCostFunction, for tables with fewer regions than servers, say 
[1,0,0,0], the cost is at its minimum, so no balancing calculation would be triggered.
   > > I simply add the scaled cost for each table, so the balancer will keep moving 
regions until all tables are balanced enough.
   > 
   > Yes, the TableSkewCostFunction should use an aggregation similar to 
RegionCountSkewCostFunction, but it should calculate from the 
perspective of each table. But I think the calculation may be inefficient 
and won't have a big impact on the subsequent balancer run. For example, if there are 100 
tables on a cluster, 98 of them have a balanced layout, while only 2 are 
imbalanced. The TableSkewCostFunction can perceive this problem, but the 
balancer cannot specifically choose regions of these two tables to generate actions, and 
only 2% of the chosen regions belong to these 2 tables. After max steps, if these 2 
tables are still imbalanced, the balancer should keep generating actions in the next 
cycle, as you have mentioned before.
   > This is a problem of gradual balancing: it's hard to know when these two 
tables will be balanced, and what a proper value of minCostNeedBalance should 
be... I think setting "hbase.master.loadbalance.bytable" to true can solve 
this problem simply...
   
   That is a valid concern. I am working on choosing a proper 
minCostNeedBalance. Please see the umbrella Jira 
https://issues.apache.org/jira/browse/HBASE-25697  and your input is welcome. 
The byTable option doesn't work, though. Can we move the discussion to the other 
Jiras, since it is beyond the scope of this PR?
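
For illustration only, here is a minimal sketch of the per-table aggregation discussed above: each table's region distribution is scaled to [0, 1] between its best and worst achievable skew, and the scaled costs are summed. The class and method names (`TableSkewSketch`, `scaledSkew`, `tableSkewCost`) are hypothetical and this is not the code in the PR; the min/max bounds are an assumption loosely modeled on how the balancer's cost functions scale a deviation from the mean.

```java
import java.util.Arrays;
import java.util.List;

public class TableSkewSketch {

    // Map value into [0, 1] relative to the best (min) and worst (max) achievable cost.
    static double scale(double min, double max, double value) {
        if (max <= min || value <= min) {
            return 0.0;
        }
        return Math.max(0.0, Math.min(1.0, (value - min) / (max - min)));
    }

    // Scaled skew of one table, given its region count on each server.
    static double scaledSkew(int[] regionsPerServer) {
        int count = regionsPerServer.length;
        if (count == 0) {
            return 0.0;
        }
        double total = 0;
        for (int r : regionsPerServer) {
            total += r;
        }
        double mean = total / count;

        // Worst case: every region of the table sits on a single server.
        double max = (count - 1) * mean + (total - mean);

        // Best case: regions spread as evenly as whole numbers allow.
        double min;
        if (count > total) {
            // Fewer regions than servers: some servers must stay empty.
            min = (count - total) * mean + (1 - mean) * total;
        } else {
            int numHigh = (int) (total - Math.floor(mean) * count);
            int numLow = count - numHigh;
            min = numHigh * (Math.ceil(mean) - mean) + numLow * (mean - Math.floor(mean));
        }

        // Actual cost: sum of absolute deviations from the mean.
        double cost = 0;
        for (int r : regionsPerServer) {
            cost += Math.abs(r - mean);
        }
        return scale(min, max, cost);
    }

    // Sum of the scaled skew over all tables (assumed aggregation, per the comment above).
    static double tableSkewCost(List<int[]> tables) {
        double sum = 0;
        for (int[] table : tables) {
            sum += scaledSkew(table);
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] sparse = {1, 0, 0, 0};  // one region on four servers: already optimal
        int[] skewed = {4, 0, 0, 0};  // four regions all on one server
        int[] even = {1, 1, 1, 1};    // perfectly balanced
        System.out.println(scaledSkew(sparse));  // 0.0
        System.out.println(scaledSkew(skewed));  // 1.0
        System.out.println(tableSkewCost(Arrays.asList(sparse, skewed, even)));  // 1.0
    }
}
```

Under these assumptions, a table such as [1,0,0,0] contributes zero cost because its actual deviation already equals the best achievable one, while a fully skewed table contributes its full scaled cost to the sum.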


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

