clarax commented on pull request #3260:
URL: https://github.com/apache/hbase/pull/3260#issuecomment-840371592
> > @sunhelly We used it on a production cluster with about 7 user tables and a test cluster with one user table. Both have a meta table, which has fewer regions than the number of nodes. Since I use the same aggregation as RegionCountSkewCostFunction, for tables with fewer regions than nodes, say [1,0,0,0], the cost is at its minimum, so no calculation would be triggered.
> > I simply add the scaled cost for each table, so the balancer will keep moving regions until all tables are balanced enough.
>
> Yes, the TableSkewCostFunction should use an aggregation similar to RegionCountSkewCostFunction, but it should calculate from the perspective of each table. But I think the calculation may be inefficient and won't have a big impact on subsequent balancer runs. For example, suppose there are 100 tables on a cluster, 98 of which have a balanced layout, while only 2 are imbalanced. The TableSkewCostFunction can perceive this problem, but the balancer cannot specifically choose regions of these two tables to generate actions, so only 2% of the chosen regions belong to these 2 tables. After max steps, if these 2 tables are still imbalanced, the balancer should keep generating actions in the next cycle, as you have mentioned before.
> This is a problem of gradual balancing: it's hard to know when these two tables can be balanced, and what a proper value of minCostNeedBalance should be... I think setting "hbase.master.loadbalance.bytable" to true can solve this problem simply...

That is a valid concern. I am working on choosing a proper minCostNeedBalance. Please see the umbrella Jira https://issues.apache.org/jira/browse/HBASE-25697; your input is welcome. The byTable option doesn't work, though.

Can we move the discussion to the other Jiras, since it is beyond the scope of this PR?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: [email protected]
