[ 
https://issues.apache.org/jira/browse/HBASE-25767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-25767:
------------------------------
    Hadoop Flags: Reviewed
    Release Note: 
In the concrete implementation classes of CandidateGenerator, we now randomly 
select a start point and then iterate sequentially, instead of using the old 
approach, which created a big array holding all the integers in [0, n), 
shuffled the array, and then iterated over it.
The new implementation is 'random' enough because we select only one 
candidate each time. The problem with the old implementation is that it created 
an array every time we wanted a candidate. With tens of thousands of regions, 
that meant allocating an array tens of thousands of entries long on every call, 
which caused heavy GC pressure and slowed down balancer execution.
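
The difference between the two strategies described above can be sketched 
roughly as follows. This is an illustration only, not the HBase code: the class 
name IterationOrderSketch, the methods pickWithShuffle and pickWithRandomStart, 
and the IntPredicate "accept" check are all hypothetical names chosen for this 
example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.IntPredicate;

// Hypothetical sketch; none of these names come from HBase itself.
class IterationOrderSketch {

  // Old style: build and shuffle a list of all indexes in [0, n) on every
  // call, then walk it. Allocates O(n) boxed integers per candidate lookup.
  public static int pickWithShuffle(int n, IntPredicate accept, Random rand) {
    if (n <= 0) {
      return -1;
    }
    List<Integer> order = new ArrayList<>(n);
    for (int i = 0; i < n; i++) {
      order.add(i);
    }
    Collections.shuffle(order, rand);
    for (int idx : order) {
      if (accept.test(idx)) {
        return idx;
      }
    }
    return -1; // no index was accepted
  }

  // New style: pick a random start point and iterate sequentially, wrapping
  // around at n. Visits every index at most once and allocates nothing.
  public static int pickWithRandomStart(int n, IntPredicate accept, Random rand) {
    if (n <= 0) {
      return -1;
    }
    int start = rand.nextInt(n);
    for (int i = 0; i < n; i++) {
      // Assumes n is far below Integer.MAX_VALUE, so start + i cannot overflow.
      int idx = (start + i) % n;
      if (accept.test(idx)) {
        return idx;
      }
    }
    return -1; // no index was accepted
  }
}
```

Both methods return the first accepted index, or -1 if none is accepted. The 
second is not a uniform random permutation, but since only a single candidate 
is taken per call, a random starting point may be sufficient, which is the 
trade-off the release note describes.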

> CandidateGenerator.getRandomIterationOrder is too slow on large cluster
> -----------------------------------------------------------------------
>
>                 Key: HBASE-25767
>                 URL: https://issues.apache.org/jira/browse/HBASE-25767
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer, Performance
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>
> Similar to HBASE-25759, it is just used to test whether we should skip 
> calculation, but in production masterServices will never be null.
> ==========
> Update: changed the title of this issue to cover removing 
> CandidateGenerator.getRandomIterationOrder, as it is too slow, which causes the 
> CandidateGenerator.getRandomIterationOrder to fail when we remove the 
> masterServices field in LocalityBasedCandidateGenerator. As this is the most 
> important change in this issue, the title was changed accordingly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
