[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2019-02-01 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20186:
---
Fix Version/s: (was: 1.5.0)

> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Fix For: 3.0.0, 2.1.0, 1.4.3
>
> Attachments: HBASE-20186.master.000.patch, 
> HBASE-20186.master.001.patch
>
>
> In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map> 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is we do not need to scan every server for each rsgroup. 
> If the processed server could be recorded,  we could skip those.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-16 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20186:
---
Fix Version/s: 1.4.3
   1.5.0
   2.1.0

> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.3
>
> Attachments: HBASE-20186.master.000.patch, 
> HBASE-20186.master.001.patch
>
>
> In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is we do not need to scan every server for each rsgroup. 
> If the processed server could be recorded,  we could skip those.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-14 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20186:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch, Xiang.

> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-20186.master.000.patch, 
> HBASE-20186.master.001.patch
>
>
> In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is we do not need to scan every server for each rsgroup. 
> If the processed server could be recorded,  we could skip those.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-13 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-20186:
-
Status: Patch Available  (was: Open)

> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-20186.master.000.patch, 
> HBASE-20186.master.001.patch
>
>
> In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is we do not need to scan every server for each rsgroup. 
> If the processed server could be recorded,  we could skip those.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-13 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-20186:
-
Status: Open  (was: Patch Available)

> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-20186.master.000.patch, 
> HBASE-20186.master.001.patch
>
>
> In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is we do not need to scan every server for each rsgroup. 
> If the processed server could be recorded,  we could skip those.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-13 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-20186:
-
Attachment: HBASE-20186.master.001.patch

> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-20186.master.000.patch, 
> HBASE-20186.master.001.patch
>
>
> In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is we do not need to scan every server for each rsgroup. 
> If the processed server could be recorded,  we could skip those.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-13 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-20186:
-
Environment: (was: In RSGroupBasedLoadBalancer
{code}
public List balanceCluster(Map 
clusterState)
{code}
The second half of the function is to calculate region move plan for regions 
which have been already placed according to the rsgroup assignment, and it is 
calculated one rsgroup after another.
The following logic to check if a server belongs to the rsgroup is not quite 
efficient, as it does not make good use of the fact that servers in RSGroupInfo 
is a TreeSet.
{code}
for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}
{code}
Given there are m region servers in the cluster and n region servers for each 
rsgroup in average, the code above has time complexity as O(m * n), while using 
TreeSet's contains(), the time complexity could be reduced to O (m * logn).

Another improvement is that we do not need to scan each server again and again, 
instead, we could record which servers have been processed and skip them)

> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-20186.master.000.patch
>
>
> In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is we do not need to scan every server for each rsgroup. 
> If the processed server could be recorded,  we could skip those.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-13 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-20186:
-
Status: Patch Available  (was: Open)

> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
> Environment: In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is that we do not need to scan each server again and 
> again, instead, we could record which servers have been processed and skip 
> them
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-20186.master.000.patch
>
>
> In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is we do not need to scan every server for each rsgroup. 
> If the processed server could be recorded,  we could skip those.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-13 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-20186:
-
Attachment: HBASE-20186.master.000.patch

> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
> Environment: In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is that we do not need to scan each server again and 
> again, instead, we could record which servers have been processed and skip 
> them
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Attachments: HBASE-20186.master.000.patch
>
>
> In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is we do not need to scan every server for each rsgroup. 
> If the processed server could be recorded,  we could skip those.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-13 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-20186:
-
Description: 
In RSGroupBasedLoadBalancer
{code}
public List balanceCluster(Map 
clusterState)
{code}
The second half of the function is to calculate region move plan for regions 
which have been already placed according to the rsgroup assignment, and it is 
calculated one rsgroup after another.
The following logic to check if a server belongs to the rsgroup is not quite 
efficient, as it does not make good use of the fact that servers in RSGroupInfo 
is a TreeSet.
{code}
for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}
{code}
Given there are m region servers in the cluster and n region servers for each 
rsgroup in average, the code above has time complexity as O(m * n), while using 
TreeSet's contains(), the time complexity could be reduced to O (m * logn).

Another improvement is we do not need to scan every server for each rsgroup. If 
the processed server could be recorded,  we could skip those.

  was:
In RSGroupBasedLoadBalancer
{code}
public List balanceCluster(Map 
clusterState)
{code}
The second half of the function is to calculate region move plan for regions 
which have been already placed according to the rsgroup assignment, and it is 
calculated one rsgroup after another.
The following logic to check if a server belongs to the rsgroup is not quite 
efficient, as it does not make good use of the fact that servers in RSGroupInfo 
is a TreeSet.
{code}
for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}
{code}
Given there are m region servers in the cluster and n region servers for each 
rsgroup in average, the code above has time complexity as O(m * n), while using 
TreeSet's contains(), the time complexity could be reduced to O (m * logn).

Another improvement is we do not need to scan every server for each rsgroup. If 
we could record which servers have been processed,  we could skip those.


> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
> Environment: In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is that we do not need to scan each server again and 
> again, instead, we could record which servers have been processed and skip 
> them
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
>
> In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's 

[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-13 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-20186:
-
Description: 
In RSGroupBasedLoadBalancer
{code}
public List balanceCluster(Map 
clusterState)
{code}
The second half of the function is to calculate region move plan for regions 
which have been already placed according to the rsgroup assignment, and it is 
calculated one rsgroup after another.
The following logic to check if a server belongs to the rsgroup is not quite 
efficient, as it does not make good use of the fact that servers in RSGroupInfo 
is a TreeSet.
{code}
for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}
{code}
Given there are m region servers in the cluster and n region servers for each 
rsgroup in average, the code above has time complexity as O(m * n), while using 
TreeSet's contains(), the time complexity could be reduced to O (m * logn).

Another improvement is we do not need to scan every server for each rsgroup. If 
we could record which servers have been processed,  we could skip those.

  was:
In RSGroupBasedLoadBalancer
{code}
public List balanceCluster(Map 
clusterState)
{code}
The second half of the function is to calculate region move plan for regions 
which have been already placed according to the rsgroup assignment, and it is 
calculated one rsgroup after another.
The following logic to check if a server belongs to the rsgroup is not quite 
efficient, as it does not make good use of the fact that servers in RSGroupInfo 
is a TreeSet.
{code}
for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}
{code}
Given there are m region servers in the cluster and n region servers for each 
rsgroup in average, the code above has time complexity as O(m * n), while using 
TreeSet's contains(), the time complexity could be reduced to O (m * logn).


> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
> Environment: In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is that we do not need to scan each server again and 
> again, instead, we could record which servers have been processed and skip 
> them
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
>
> In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is we do not need to scan every server for each rsgroup. 
> If 

[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-13 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-20186:
-
Environment: 
In RSGroupBasedLoadBalancer
{code}
public List balanceCluster(Map 
clusterState)
{code}
The second half of the function is to calculate region move plan for regions 
which have been already placed according to the rsgroup assignment, and it is 
calculated one rsgroup after another.
The following logic to check if a server belongs to the rsgroup is not quite 
efficient, as it does not make good use of the fact that servers in RSGroupInfo 
is a TreeSet.
{code}
for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}
{code}
Given there are m region servers in the cluster and n region servers for each 
rsgroup in average, the code above has time complexity as O(m * n), while using 
TreeSet's contains(), the time complexity could be reduced to O (m * logn).

Another improvement is that we do not need to scan each server again and again, 
instead, we could record which servers have been processed and skip them

  was:
In RSGroupBasedLoadBalancer
{code}
public List balanceCluster(Map 
clusterState)
{code}
The second half of the function is to calculate region move plan for regions 
which have been already placed according to the rsgroup assignment, and it is 
calculated one rsgroup after another.
The following logic to check if a server belongs to the rsgroup is not quite 
efficient, as it does not make good use of the fact that servers in RSGroupInfo 
is a TreeSet.
{code}
for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}
{code}
Given there are m region servers in the cluster and n region servers for each 
rsgroup in average, the code above has time complexity as O(m * n), while using 
TreeSet's contains(), the time complexity could be reduced to O (m * logn).


> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
> Environment: In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is that we do not need to scan each server again and 
> again, instead, we could record which servers have been processed and skip 
> them
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
>
> In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-13 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-20186:
-
Environment: 
In RSGroupBasedLoadBalancer
{code}
public List balanceCluster(Map 
clusterState)
{code}
The second half of the function is to calculate region move plan for regions 
which have been already placed according to the rsgroup assignment, and it is 
calculated one rsgroup after another.
The following logic to check if a server belongs to the rsgroup is not quite 
efficient, as it does not make good use of the fact that servers in RSGroupInfo 
is a TreeSet.
{code}
for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}
{code}
Given there are m region servers in the cluster and n region servers for each 
rsgroup in average, the code above has time complexity as O(m * n), while using 
TreeSet's contains(), the time complexity could be reduced to O (m * logn).
Description: 
In RSGroupBasedLoadBalancer
{code}
public List balanceCluster(Map 
clusterState)
{code}
The second half of the function is to calculate region move plan for regions 
which have been already placed according to the rsgroup assignment, and it is 
calculated one rsgroup after another.
The following logic to check if a server belongs to the rsgroup is not quite 
efficient, as it does not make good use of the fact that servers in RSGroupInfo 
is a TreeSet.
{code}
for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}
{code}
Given there are m region servers in the cluster and n region servers for each 
rsgroup in average, the code above has time complexity as O(m * n), while using 
TreeSet's contains(), the time complexity could be reduced to O (m * logn).

  was:
In RSGroupBasedLoadBalancer
{code}
public List balanceCluster(Map 
clusterState)
{code}


for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}


> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
> Environment: In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
>
> In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-13 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-20186:
-
Description: 
In RSGroupBasedLoadBalancer
{code}
public List balanceCluster(Map 
clusterState)
{code}


for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}

  was:
In {
public List balanceCluster(Map 
clusterState)}

for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}


> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
>
> In RSGroupBasedLoadBalancer
> {code}
> public List balanceCluster(Map 
> clusterState)
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-13 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-20186:
-
Description: 
In {{public List balanceCluster(Map 
clusterState)}}

for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}

> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
>
> In {{public List balanceCluster(Map 
> clusterState)}}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

2018-03-13 Thread Xiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiang Li updated HBASE-20186:
-
Description: 
In {
public List balanceCluster(Map 
clusterState)}

for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}

  was:
In {{public List balanceCluster(Map 
clusterState)}}

for (Address sName : info.getServers()) {
  for(ServerName curr: clusterState.keySet()) {
if(curr.getAddress().equals(sName)) {
  groupClusterState.put(curr, correctedState.get(curr));
}
  }
}


> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> --
>
> Key: HBASE-20186
> URL: https://issues.apache.org/jira/browse/HBASE-20186
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
>
> In {
> public List balanceCluster(Map 
> clusterState)}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
> if(curr.getAddress().equals(sName)) {
>   groupClusterState.put(curr, correctedState.get(curr));
> }
>   }
> }



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)