[
https://issues.apache.org/jira/browse/HBASE-25768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529777#comment-17529777
]
Zheng Wang commented on HBASE-25768:
------------------------------------
I ran into a similar issue recently. On a cluster with 1000+ tables, enabling
balanceByTable made a balance run take several hours. In the end I disabled it
and set hbase.master.balancer.stochastic.tableSkewCost to 1000 instead, which
works well.
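
To illustrate why raising the weight helps, here is a minimal sketch, assuming
the balancer scores a layout as a weighted sum of per-function costs that are
each scaled to [0, 1]. The class and method names are hypothetical, the cost
values are made up, and the weights 500 (region count skew) and 35 (table
skew) are the defaults as I recall them, so please check them against your
HBase version.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is not HBase code.
public class WeightedCostSketch {

    // Weighted sum of per-function costs, each assumed to lie in [0, 1].
    static double totalCost(Map<String, Double> weights, Map<String, Double> costs) {
        double total = 0.0;
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            total += e.getValue() * costs.getOrDefault(e.getKey(), 0.0);
        }
        return total;
    }

    // Fraction of the total cost contributed by one cost function.
    static double share(String name, Map<String, Double> weights, Map<String, Double> costs) {
        double total = totalCost(weights, costs);
        if (total == 0.0) {
            return 0.0;
        }
        return weights.getOrDefault(name, 0.0) * costs.getOrDefault(name, 0.0) / total;
    }

    public static void main(String[] args) {
        // Made-up, equal raw costs for both functions.
        Map<String, Double> costs = new LinkedHashMap<>();
        costs.put("regionCountSkew", 0.10);
        costs.put("tableSkew", 0.10);

        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("regionCountSkew", 500.0); // assumed default weight
        weights.put("tableSkew", 35.0);        // assumed default weight
        System.out.println("tableSkew share, weight 35:   " + share("tableSkew", weights, costs));

        weights.put("tableSkew", 1000.0);      // the value used in the comment above
        System.out.println("tableSkew share, weight 1000: " + share("tableSkew", weights, costs));
    }
}
```

With equal raw costs, the table skew term goes from a small fraction of the
total to the dominant term, so a single cluster-wide run is pushed toward
per-table evenness without a separate balance pass for every table.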
> Support an overall coarse and fast balance strategy for StochasticLoadBalancer
> ------------------------------------------------------------------------------
>
> Key: HBASE-25768
> URL: https://issues.apache.org/jira/browse/HBASE-25768
> Project: HBase
> Issue Type: Improvement
> Components: Balancer
> Affects Versions: 3.0.0-alpha-1, 2.0.0, 1.4.13
> Reporter: Xiaolin Ha
> Assignee: Xiaolin Ha
> Priority: Major
>
> When we use StochasticLoadBalancer + balanceByTable, we can run into two
> difficulties.
> # The regions of each table are distributed uniformly, but across the
> cluster as a whole there is still an imbalance between RSes;
> # After a large-scale restart of RSes, or an expansion of groups or of the
> cluster, we want the balancer to run as soon as possible, but the
> StochasticLoadBalancer may need a lot of time to compute costs.
> We can detect these circumstances in StochasticLoadBalancer (for example, by
> the percentage of skewed tables) and, before trying the normal balance steps,
> add a strategy that balances like the SimpleLoadBalancer or uses only a few
> lightweight cost functions.
>
>
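
The detection step proposed in the description could look roughly like the
sketch below. It is only an illustration under stated assumptions: the class
CoarseBalanceTrigger, its methods, the "max minus min greater than 1" skew
rule, and the threshold are all hypothetical and do not come from the HBase
code base or from the issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a "skewed table percentage" check; not HBase code.
public class CoarseBalanceTrigger {

    // Assumed rule: a table is skewed when its per-server region counts
    // differ by more than one between the fullest and emptiest server.
    static boolean isSkewed(int[] regionsPerServer) {
        if (regionsPerServer.length == 0) {
            return false;
        }
        int min = regionsPerServer[0];
        int max = regionsPerServer[0];
        for (int n : regionsPerServer) {
            min = Math.min(min, n);
            max = Math.max(max, n);
        }
        return max - min > 1;
    }

    // Fraction of tables that are skewed under the rule above.
    static double skewedTableFraction(Map<String, int[]> regionsPerServerByTable) {
        if (regionsPerServerByTable.isEmpty()) {
            return 0.0;
        }
        long skewed = regionsPerServerByTable.values().stream()
                .filter(CoarseBalanceTrigger::isSkewed)
                .count();
        return (double) skewed / regionsPerServerByTable.size();
    }

    // Take the coarse, fast path when enough tables are skewed.
    static boolean shouldUseCoarseBalance(Map<String, int[]> regionsPerServerByTable,
                                          double threshold) {
        return skewedTableFraction(regionsPerServerByTable) >= threshold;
    }

    public static void main(String[] args) {
        // Region counts per table across three region servers (made-up data).
        Map<String, int[]> layout = new HashMap<>();
        layout.put("t1", new int[] {10, 0, 0}); // e.g. after a server restart
        layout.put("t2", new int[] {3, 3, 4});
        layout.put("t3", new int[] {0, 8, 1});

        System.out.println("skewed fraction: " + skewedTableFraction(layout));
        System.out.println("coarse balance:  " + shouldUseCoarseBalance(layout, 0.5));
    }
}
```

A check like this is cheap (one pass over the per-table counts), so it could
run before the normal cost computation and decide whether a simple, fast
balance should be done first.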
--
This message was sent by Atlassian Jira
(v8.20.7#820007)