[jira] [Commented] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

Ted Yu (JIRA) Tue, 13 Mar 2018 10:59:34 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16397380#comment-16397380
 ]


Ted Yu commented on HBASE-20186:
--------------------------------

Build error was not related:
{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-install-plugin:2.5.2:install (default-install) 
on project hbase-thrift: Failed to install metadata 
org.apache.hbase:hbase-thrift:3.0.0-SNAPSHOT/maven-metadata.xml: Could not 
parse metadata 
/home/jenkins/.m2/repository/org/apache/hbase/hbase-thrift/3.0.0-SNAPSHOT/maven-metadata-local.xml:
 in epilog non whitespace content is not allowed but got / (position: END_TAG 
seen ...</metadata>\n/... @25:2)  -> [Help 1]
{code}
Can you format the code ?
e.g. leave a space between {{if}} and left parenthesis.

Otherwise looks good.

> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when 
> calculating cluster state for each rsgroup
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-20186
>                 URL: https://issues.apache.org/jira/browse/HBASE-20186
>             Project: HBase
>          Issue Type: Improvement
>          Components: rsgroup
>            Reporter: Xiang Li
>            Assignee: Xiang Li
>            Priority: Minor
>         Attachments: HBASE-20186.master.000.patch
>
>
> In RSGroupBasedLoadBalancer
> {code}
> public List<RegionPlan> balanceCluster(Map<ServerName, List<RegionInfo>> 
> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions 
> which have been already placed according to the rsgroup assignment, and it is 
> calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite 
> efficient, as it does not make good use of the fact that servers in 
> RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
>     if(curr.getAddress().equals(sName)) {
>       groupClusterState.put(curr, correctedState.get(curr));
>     }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each 
> rsgroup in average, the code above has time complexity as O(m * n), while 
> using TreeSet's contains(), the time complexity could be reduced to O (m * 
> logn).
> Another improvement is we do not need to scan every server for each rsgroup. 
> If the processed server could be recorded,  we could skip those.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

Reply via email to