[jira] [Commented] (HBASE-25768) Support an overall coarse and fast balance strategy for StochasticLoadBalancer

2024-01-13 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17806349#comment-17806349
 ] 

Bryan Beaudreault commented on HBASE-25768:
---

Moving out of 2.6.0

> Support an overall coarse and fast balance strategy for StochasticLoadBalancer
> --
>
> Key: HBASE-25768
> URL: https://issues.apache.org/jira/browse/HBASE-25768
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 3.0.0-alpha-1, 2.0.0, 1.4.13
>Reporter: Xiaolin Ha
>Assignee: Xiaolin Ha
>Priority: Major
> Fix For: 3.0.0-beta-2
>
>
> When we use StochasticLoadBalancer + balanceByTable, we could face two 
> difficulties.
>  # For each table, their regions are distributed uniformly, but for the 
> overall cluster, still exiting imbalance between RSes;
>  # When there are large-scaled restart of RSes, or expansion for groups or 
> cluster, we hope the balancer can execute as soon as possible, but the 
> StochasticLoadBalancer may need a lot of time to compute costs.
> We can detect these circumstances in StochasticLoadBalancer(such as using the 
> percentage of skew tables), and before the normal balance steps trying, we 
> can add a strategy to let it just balance like the SimpleLoadBalancer or use 
> few light cost functions here.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-25768) Support an overall coarse and fast balance strategy for StochasticLoadBalancer

2022-06-12 Thread Xiaolin Ha (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553386#comment-17553386
 ] 

Xiaolin Ha commented on HBASE-25768:


The PR is ready,its intention is master and branch-2.

> Support an overall coarse and fast balance strategy for StochasticLoadBalancer
> --
>
> Key: HBASE-25768
> URL: https://issues.apache.org/jira/browse/HBASE-25768
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 3.0.0-alpha-1, 2.0.0, 1.4.13
>Reporter: Xiaolin Ha
>Assignee: Xiaolin Ha
>Priority: Major
>
> When we use StochasticLoadBalancer + balanceByTable, we could face two 
> difficulties.
>  # For each table, their regions are distributed uniformly, but for the 
> overall cluster, still exiting imbalance between RSes;
>  # When there are large-scaled restart of RSes, or expansion for groups or 
> cluster, we hope the balancer can execute as soon as possible, but the 
> StochasticLoadBalancer may need a lot of time to compute costs.
> We can detect these circumstances in StochasticLoadBalancer(such as using the 
> percentage of skew tables), and before the normal balance steps trying, we 
> can add a strategy to let it just balance like the SimpleLoadBalancer or use 
> few light cost functions here.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HBASE-25768) Support an overall coarse and fast balance strategy for StochasticLoadBalancer

2022-05-08 Thread Zheng Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17533612#comment-17533612
 ] 

Zheng Wang commented on HBASE-25768:


[~Xiaolin Ha] Yeah, you are right, this patch is useful for some cases.

> Support an overall coarse and fast balance strategy for StochasticLoadBalancer
> --
>
> Key: HBASE-25768
> URL: https://issues.apache.org/jira/browse/HBASE-25768
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 3.0.0-alpha-1, 2.0.0, 1.4.13
>Reporter: Xiaolin Ha
>Assignee: Xiaolin Ha
>Priority: Major
>
> When we use StochasticLoadBalancer + balanceByTable, we could face two 
> difficulties.
>  # For each table, their regions are distributed uniformly, but for the 
> overall cluster, still exiting imbalance between RSes;
>  # When there are large-scaled restart of RSes, or expansion for groups or 
> cluster, we hope the balancer can execute as soon as possible, but the 
> StochasticLoadBalancer may need a lot of time to compute costs.
> We can detect these circumstances in StochasticLoadBalancer(such as using the 
> percentage of skew tables), and before the normal balance steps trying, we 
> can add a strategy to let it just balance like the SimpleLoadBalancer or use 
> few light cost functions here.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HBASE-25768) Support an overall coarse and fast balance strategy for StochasticLoadBalancer

2022-05-06 Thread Xiaolin Ha (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17532775#comment-17532775
 ] 

Xiaolin Ha commented on HBASE-25768:


Hi, [~kingWang] , the patch is for branch-2.4+. If you want to apply this patch 
for 2.1.0, most improvements since 2.1.0 is not  required here. The main 
changes here are the chosen cost functions. You can try to back port it if you 
have interest. Thanks.

> Support an overall coarse and fast balance strategy for StochasticLoadBalancer
> --
>
> Key: HBASE-25768
> URL: https://issues.apache.org/jira/browse/HBASE-25768
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 3.0.0-alpha-1, 2.0.0, 1.4.13
>Reporter: Xiaolin Ha
>Assignee: Xiaolin Ha
>Priority: Major
>
> When we use StochasticLoadBalancer + balanceByTable, we could face two 
> difficulties.
>  # For each table, their regions are distributed uniformly, but for the 
> overall cluster, still exiting imbalance between RSes;
>  # When there are large-scaled restart of RSes, or expansion for groups or 
> cluster, we hope the balancer can execute as soon as possible, but the 
> StochasticLoadBalancer may need a lot of time to compute costs.
> We can detect these circumstances in StochasticLoadBalancer(such as using the 
> percentage of skew tables), and before the normal balance steps trying, we 
> can add a strategy to let it just balance like the SimpleLoadBalancer or use 
> few light cost functions here.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HBASE-25768) Support an overall coarse and fast balance strategy for StochasticLoadBalancer

2022-05-05 Thread Xiaolin Ha (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17532088#comment-17532088
 ] 

Xiaolin Ha commented on HBASE-25768:


[~filtertip] , thanks, I think the main reason why disabled balanceByTable and 
increased multiplier for the cost of table skew work well is that the 
computedMaxSteps is far smaller in one round of balance cluster, it balances 
all the tables on the cluster in one round. 

But when enabling balanceByTable, balance cluster needs 
computedMaxSteps*tableCount in one round. It is time consuming, but it balanced 
more accurately, because it considers all the cost functions for all the table, 
instead of progressive convergence by balancing cluster time after time. Since 
all tables computed costs together, maybe you can not balance a pressure skew 
table when disabling balanceByTable.

Here if we use an overall checker to trigger coarse balance, it can increase 
the speed when the cluster/table has skew issues, because the time that 
calculate costs also should multiple the count of cost functions and time 
consuming of each cost functions.

> Support an overall coarse and fast balance strategy for StochasticLoadBalancer
> --
>
> Key: HBASE-25768
> URL: https://issues.apache.org/jira/browse/HBASE-25768
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 3.0.0-alpha-1, 2.0.0, 1.4.13
>Reporter: Xiaolin Ha
>Assignee: Xiaolin Ha
>Priority: Major
>
> When we use StochasticLoadBalancer + balanceByTable, we could face two 
> difficulties.
>  # For each table, their regions are distributed uniformly, but for the 
> overall cluster, still exiting imbalance between RSes;
>  # When there are large-scaled restart of RSes, or expansion for groups or 
> cluster, we hope the balancer can execute as soon as possible, but the 
> StochasticLoadBalancer may need a lot of time to compute costs.
> We can detect these circumstances in StochasticLoadBalancer(such as using the 
> percentage of skew tables), and before the normal balance steps trying, we 
> can add a strategy to let it just balance like the SimpleLoadBalancer or use 
> few light cost functions here.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HBASE-25768) Support an overall coarse and fast balance strategy for StochasticLoadBalancer

2022-04-29 Thread kangTwang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529821#comment-17529821
 ] 

kangTwang commented on HBASE-25768:
---

[~filtertip] 

I've tried this parameter in the environment before. No, it's just a temporary 
scheme

> Support an overall coarse and fast balance strategy for StochasticLoadBalancer
> --
>
> Key: HBASE-25768
> URL: https://issues.apache.org/jira/browse/HBASE-25768
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 3.0.0-alpha-1, 2.0.0, 1.4.13
>Reporter: Xiaolin Ha
>Assignee: Xiaolin Ha
>Priority: Major
>
> When we use StochasticLoadBalancer + balanceByTable, we could face two 
> difficulties.
>  # For each table, their regions are distributed uniformly, but for the 
> overall cluster, still exiting imbalance between RSes;
>  # When there are large-scaled restart of RSes, or expansion for groups or 
> cluster, we hope the balancer can execute as soon as possible, but the 
> StochasticLoadBalancer may need a lot of time to compute costs.
> We can detect these circumstances in StochasticLoadBalancer(such as using the 
> percentage of skew tables), and before the normal balance steps trying, we 
> can add a strategy to let it just balance like the SimpleLoadBalancer or use 
> few light cost functions here.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HBASE-25768) Support an overall coarse and fast balance strategy for StochasticLoadBalancer

2022-04-29 Thread kangTwang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529818#comment-17529818
 ] 

kangTwang commented on HBASE-25768:
---

[~Xiaolin Ha]  Will there be a patch of HBase 2.1.0  version at present?  PR is 
3.x version?

> Support an overall coarse and fast balance strategy for StochasticLoadBalancer
> --
>
> Key: HBASE-25768
> URL: https://issues.apache.org/jira/browse/HBASE-25768
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 3.0.0-alpha-1, 2.0.0, 1.4.13
>Reporter: Xiaolin Ha
>Assignee: Xiaolin Ha
>Priority: Major
>
> When we use StochasticLoadBalancer + balanceByTable, we could face two 
> difficulties.
>  # For each table, their regions are distributed uniformly, but for the 
> overall cluster, still exiting imbalance between RSes;
>  # When there are large-scaled restart of RSes, or expansion for groups or 
> cluster, we hope the balancer can execute as soon as possible, but the 
> StochasticLoadBalancer may need a lot of time to compute costs.
> We can detect these circumstances in StochasticLoadBalancer(such as using the 
> percentage of skew tables), and before the normal balance steps trying, we 
> can add a strategy to let it just balance like the SimpleLoadBalancer or use 
> few light cost functions here.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HBASE-25768) Support an overall coarse and fast balance strategy for StochasticLoadBalancer

2022-04-28 Thread Zheng Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529777#comment-17529777
 ] 

Zheng Wang commented on HBASE-25768:


I encountered similar issue recently, a cluster has 1000+ table, when i enable 
balanceByTable, it spend several hours to do the balance, finally i disable it, 
and set hbase.master.balancer.stochastic.tableSkewCost to 1000 instead, it 
works well.

 

> Support an overall coarse and fast balance strategy for StochasticLoadBalancer
> --
>
> Key: HBASE-25768
> URL: https://issues.apache.org/jira/browse/HBASE-25768
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 3.0.0-alpha-1, 2.0.0, 1.4.13
>Reporter: Xiaolin Ha
>Assignee: Xiaolin Ha
>Priority: Major
>
> When we use StochasticLoadBalancer + balanceByTable, we could face two 
> difficulties.
>  # For each table, their regions are distributed uniformly, but for the 
> overall cluster, still exiting imbalance between RSes;
>  # When there are large-scaled restart of RSes, or expansion for groups or 
> cluster, we hope the balancer can execute as soon as possible, but the 
> StochasticLoadBalancer may need a lot of time to compute costs.
> We can detect these circumstances in StochasticLoadBalancer(such as using the 
> percentage of skew tables), and before the normal balance steps trying, we 
> can add a strategy to let it just balance like the SimpleLoadBalancer or use 
> few light cost functions here.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HBASE-25768) Support an overall coarse and fast balance strategy for StochasticLoadBalancer

2022-04-25 Thread Xiaolin Ha (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17527488#comment-17527488
 ] 

Xiaolin Ha commented on HBASE-25768:


Thanks for your attention, [~kingWang] , I'll complete the PR in the near few 
days.

> Support an overall coarse and fast balance strategy for StochasticLoadBalancer
> --
>
> Key: HBASE-25768
> URL: https://issues.apache.org/jira/browse/HBASE-25768
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 3.0.0-alpha-1, 2.0.0, 1.4.13
>Reporter: Xiaolin Ha
>Assignee: Xiaolin Ha
>Priority: Major
>
> When we use StochasticLoadBalancer + balanceByTable, we could face two 
> difficulties.
>  # For each table, their regions are distributed uniformly, but for the 
> overall cluster, still exiting imbalance between RSes;
>  # When there are large-scaled restart of RSes, or expansion for groups or 
> cluster, we hope the balancer can execute as soon as possible, but the 
> StochasticLoadBalancer may need a lot of time to compute costs.
> We can detect these circumstances in StochasticLoadBalancer(such as using the 
> percentage of skew tables), and before the normal balance steps trying, we 
> can add a strategy to let it just balance like the SimpleLoadBalancer or use 
> few light cost functions here.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HBASE-25768) Support an overall coarse and fast balance strategy for StochasticLoadBalancer

2022-04-24 Thread kangTwang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526990#comment-17526990
 ] 

kangTwang commented on HBASE-25768:
---

Hi [~Xiaolin Ha] :

Has this PR not been completed yet?

> Support an overall coarse and fast balance strategy for StochasticLoadBalancer
> --
>
> Key: HBASE-25768
> URL: https://issues.apache.org/jira/browse/HBASE-25768
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 3.0.0-alpha-1, 2.0.0, 1.4.13
>Reporter: Xiaolin Ha
>Assignee: Xiaolin Ha
>Priority: Major
>
> When we use StochasticLoadBalancer + balanceByTable, we could face two 
> difficulties.
>  # For each table, their regions are distributed uniformly, but for the 
> overall cluster, still exiting imbalance between RSes;
>  # When there are large-scaled restart of RSes, or expansion for groups or 
> cluster, we hope the balancer can execute as soon as possible, but the 
> StochasticLoadBalancer may need a lot of time to compute costs.
> We can detect these circumstances in StochasticLoadBalancer(such as using the 
> percentage of skew tables), and before the normal balance steps trying, we 
> can add a strategy to let it just balance like the SimpleLoadBalancer or use 
> few light cost functions here.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HBASE-25768) Support an overall coarse and fast balance strategy for StochasticLoadBalancer

2022-03-23 Thread Xiaolin Ha (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17511381#comment-17511381
 ] 

Xiaolin Ha commented on HBASE-25768:


I have attached a design doc, suggestions are welcome.

> Support an overall coarse and fast balance strategy for StochasticLoadBalancer
> --
>
> Key: HBASE-25768
> URL: https://issues.apache.org/jira/browse/HBASE-25768
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Affects Versions: 3.0.0-alpha-1, 2.0.0, 1.4.13
>Reporter: Xiaolin Ha
>Assignee: Xiaolin Ha
>Priority: Major
>
> When we use StochasticLoadBalancer + balanceByTable, we could face two 
> difficulties.
>  # For each table, their regions are distributed uniformly, but for the 
> overall cluster, still exiting imbalance between RSes;
>  # When there are large-scaled restart of RSes, or expansion for groups or 
> cluster, we hope the balancer can execute as soon as possible, but the 
> StochasticLoadBalancer may need a lot of time to compute costs.
> We can detect these circumstances in StochasticLoadBalancer(such as using the 
> percentage of skew tables), and before the normal balance steps trying, we 
> can add a strategy to let it just balance like the SimpleLoadBalancer or use 
> few light cost functions here.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)