[
https://issues.apache.org/jira/browse/KUDU-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788260#comment-17788260
]
ASF subversion and git services commented on KUDU-3497:
-------------------------------------------------------
Commit d8467c571a5e7eecaf689c9c9647851ce9bf0fd1 in kudu's branch
refs/heads/master from 宋家成
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=d8467c571 ]
KUDU-3497 optimize leader rebalancer algorithm
Leader rebalancing does not always work well at present, especially for
tables with a small number of hash partitions.
For instance, take a table consisting of 9 tablets with RF = 3 in a
3-tserver cluster.
Its leader distribution is as follows:
Tablet server A : 4
Tablet server B : 4
Tablet server C : 1
With the current algorithm, no rebalance operation is scheduled for
this distribution.
This patch therefore introduces an algorithm intended to always find
the best leader distribution.
The formula is:
  expected leader num = (tablets sum) % (tablet server num) == 0 ?
      (tablets sum) / (tablet server num) :
      ceil((tablets sum) / (tablet server num))
A tserver whose leader count exceeds the expected value needs
to transfer its excess leaderships.
So the leader skew will never be more than 1.
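
As a rough illustration of the formula above (not the actual Kudu
implementation, which is C++), the expected value and the tservers that
must give up leaderships can be sketched in Python. The function names
expected_leader_count and leaders_to_transfer are hypothetical and chosen
only for this example; both branches of the formula reduce to a ceiling
division, which is what the sketch computes.

```python
def expected_leader_count(num_tablets: int, num_tservers: int) -> int:
    """Per-tserver leader target: ceil(num_tablets / num_tservers)."""
    if num_tservers <= 0:
        raise ValueError("num_tservers must be positive")
    # Integer ceiling division; equals the plain quotient when the
    # remainder is zero, matching both branches of the formula.
    return -(-num_tablets // num_tservers)


def leaders_to_transfer(leaders_by_tserver: dict[str, int]) -> dict[str, int]:
    """Map each overloaded tserver to the number of leaderships above the target.

    Hypothetical helper: it only reports the excess, and does not choose
    which tablets move or which tserver receives them.
    """
    total = sum(leaders_by_tserver.values())
    expected = expected_leader_count(total, len(leaders_by_tserver))
    return {
        tserver: count - expected
        for tserver, count in leaders_by_tserver.items()
        if count > expected
    }


# The distribution from the example above: 9 tablets on 3 tservers.
distribution = {"A": 4, "B": 4, "C": 1}
print(expected_leader_count(9, 3))        # 3
print(leaders_to_transfer(distribution))  # {'A': 1, 'B': 1}
```

For the 4/4/1 distribution the target is 3, so tservers A and B each have
one leadership to hand over, which yields 3/3/3 if both go to C.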
Change-Id: I0f1fe796fd98da2d8764da793b7e254319e6348a
Reviewed-on: http://gerrit.cloudera.org:8080/20310
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <[email protected]>
> Leader rebalance not working well
> ---------------------------------
>
> Key: KUDU-3497
> URL: https://issues.apache.org/jira/browse/KUDU-3497
> Project: Kudu
> Issue Type: Bug
> Reporter: Song Jiacheng
> Priority: Major
> Attachments: KUDU-3497.patch, image-2023-07-28-18-26-59-763.png
>
>
> Leader rebalancing runs in a thread that tries to balance the leaders of each
> table. But there are some situations where it does not work.
> For instance, take a table consisting of 9 tablets with a replication factor
> of 3 in a 3-tserver cluster.
> Its leader distribution is as follows:
> Tablet server A : 4
> Tablet server B : 4
> Tablet server C : 1
> !image-2023-07-28-18-26-59-763.png!
> With the current algorithm, no rebalance operation is scheduled for this
> distribution.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)