[ 
https://issues.apache.org/jira/browse/HBASE-26178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clara Xiong updated HBASE-26178:
--------------------------------
    Description: 
With ~800 node and ~500 regions per node on our large production cluster, 
balancer cannot complete within hours even after we just add 2% servers after 
maintenance. 

The unit tests with larger number of regions are taking longer and longer with 
changes to balancer with recent changes too, evident with the increment of the 
time limit recent PR's included.

It is time to replace some of the data structure for better time complexity 
including:

int[][] regionsPerServer; // serverIndex -> region list

int[][] regionsPerHost; // hostIndex -> list of regions
 int[][] regionsPerRack; // rackIndex -> region list

// serverIndex -> sorted list of regions by primary region index
 ArrayList<HashSet<Integer>> primariesOfRegionsPerServer;

// hostIndex -> sorted list of regions by primary region index
 int[][] primariesOfRegionsPerHost;

// rackIndex -> sorted list of regions by primary region index
 int[][] primariesOfRegionsPerRack;

Areas of algorithm improvement include:
 # O(n ) to O(1) time to  lookup or update per server/host/rack for every move 
test iteration.(n = number of regions per server/host/rack).
 # O(n ) to O(1) time for reserse lookup of region index from primary index.
 # Recomputation of primaryRegionCountSkewCostFunction reduced from O(n ) to 
O(1)

 

  was:
With ~800 node and ~500 regions per node on our large production cluster, 
balancer cannot complete within hours even after we just add 2% servers after 
maintenance. 

The unit tests with larger number of regions are taking longer and longer with 
changes to balancer with recent changes too, evident with the increment of the 
time limit recent PR's included.

It is time to replace some of the data structure for better time complexity 
including:

int[][] regionsPerServer; // serverIndex -> region list

int[][] regionsPerHost; // hostIndex -> list of regions
 int[][] regionsPerRack; // rackIndex -> region list

// serverIndex -> sorted list of regions by primary region index
 ArrayList<HashSet<Integer>> primariesOfRegionsPerServer;

// hostIndex -> sorted list of regions by primary region index
 int[][] primariesOfRegionsPerHost;

// rackIndex -> sorted list of regions by primary region index
 int[][] primariesOfRegionsPerRack;

Areas of algorithm improvement include:
 # O(n) to O(1) time to  lookup or update per server/host/rack for every move 
test iteration.(n = number of regions per server/host/rack).
 # O(n) to O(1) time for reserse lookup of region index from primary index.
 # Recomputation of primaryRegionCountSkewCostFunction reduced from O(n) to O(1)

 


> Improve data structure and algorithm for BalanceClusterState to improve 
> computation speed for large cluster
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-26178
>                 URL: https://issues.apache.org/jira/browse/HBASE-26178
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Clara Xiong
>            Priority: Major
>
> With ~800 node and ~500 regions per node on our large production cluster, 
> balancer cannot complete within hours even after we just add 2% servers after 
> maintenance. 
> The unit tests with larger number of regions are taking longer and longer 
> with changes to balancer with recent changes too, evident with the increment 
> of the time limit recent PR's included.
> It is time to replace some of the data structure for better time complexity 
> including:
> int[][] regionsPerServer; // serverIndex -> region list
> int[][] regionsPerHost; // hostIndex -> list of regions
>  int[][] regionsPerRack; // rackIndex -> region list
> // serverIndex -> sorted list of regions by primary region index
>  ArrayList<HashSet<Integer>> primariesOfRegionsPerServer;
> // hostIndex -> sorted list of regions by primary region index
>  int[][] primariesOfRegionsPerHost;
> // rackIndex -> sorted list of regions by primary region index
>  int[][] primariesOfRegionsPerRack;
> Areas of algorithm improvement include:
>  # O(n ) to O(1) time to  lookup or update per server/host/rack for every 
> move test iteration.(n = number of regions per server/host/rack).
>  # O(n ) to O(1) time for reserse lookup of region index from primary index.
>  # Recomputation of primaryRegionCountSkewCostFunction reduced from O(n ) to 
> O(1)
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to