[
https://issues.apache.org/jira/browse/HBASE-25767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Duo Zhang updated HBASE-25767:
------------------------------
Hadoop Flags: Reviewed
Release Note:
In the concrete implementation classes of CandidateGenerator, we now just pick
a random start point and then iterate sequentially. The old approach created a
big array holding all the integers in [0, n), shuffled the array, and then
iterated over it.
The new implementation is 'random' enough, because each call selects only one
candidate. The problem with the old implementation is that it created a new
array every time we wanted a candidate. With tens of thousands of regions, that
meant allocating an array tens of thousands of entries long on every call,
which caused heavy GC pressure and slowed down balancer execution.
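
The difference between the two strategies can be sketched roughly as follows.
This is a minimal illustration, not the actual HBase code: the class name
IterationOrderSketch and the methods shuffledOrder and randomStartOrder are
hypothetical names chosen for this example, and the real CandidateGenerator
implementations may differ in detail.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.stream.IntStream;

// Hypothetical sketch; names are illustrative and not taken from HBase.
final class IterationOrderSketch {

  // Old style, as described in the release note: materialize every index
  // in [0, n), shuffle, then iterate. Allocates O(n) memory on each call.
  static List<Integer> shuffledOrder(int n, Random rng) {
    List<Integer> order = new ArrayList<>(n);
    for (int i = 0; i < n; i++) {
      order.add(i);
    }
    Collections.shuffle(order, rng);
    return order;
  }

  // New style: pick one random start point, then walk sequentially and
  // wrap around. Every index in [0, n) is visited exactly once, and no
  // array of size n is allocated.
  static IntStream randomStartOrder(int n, Random rng) {
    if (n <= 0) {
      return IntStream.empty(); // Random.nextInt(0) would throw
    }
    int start = rng.nextInt(n);
    return IntStream.range(0, n).map(i -> (start + i) % n);
  }

  public static void main(String[] args) {
    Random rng = new Random();
    // Prints the five indices starting at a random offset, wrapping around.
    randomStartOrder(5, rng).forEach(i -> System.out.print(i + " "));
    System.out.println();
  }
}
```

Only the first index is random in the second variant, and the rest follow in
order. That matches the reasoning in the note: a caller that takes just one
candidate per call gets little benefit from a fully shuffled order.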
> CandidateGenerator.getRandomIterationOrder is too slow on large cluster
> -----------------------------------------------------------------------
>
> Key: HBASE-25767
> URL: https://issues.apache.org/jira/browse/HBASE-25767
> Project: HBase
> Issue Type: Improvement
> Components: Balancer, Performance
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
>
> Similar to HBASE-25759, it is only used to test whether we should skip the
> calculation, but in production masterServices will never be null.
> ==========
> Update: changed the title of this issue to cover removing
> CandidateGenerator.getRandomIterationOrder, as it is too slow, which causes the
> CandidateGenerator.getRandomIterationOrder to fail when we remove the
> masterServices field in LocalityBasedCandidateGenerator. As this is the most
> important change in this issue, the title was changed accordingly.