[ 
https://issues.apache.org/jira/browse/KUDU-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788260#comment-17788260
 ] 

ASF subversion and git services commented on KUDU-3497:
-------------------------------------------------------

Commit d8467c571a5e7eecaf689c9c9647851ce9bf0fd1 in kudu's branch 
refs/heads/master from 宋家成
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=d8467c571 ]

KUDU-3497 optimize leader rebalancer algorithm

Leader rebalancing does not always work well at present, especially
for tables with a small number of hash partitions.
For instance, consider a table consisting of 9 tablets with RF = 3 in
a 3-tserver cluster.

Its leader distribution is as follows:

Tablet server A : 4
Tablet server B : 4
Tablet server C : 1

With the current algorithm, no rebalance operation is scheduled in
this case.

Therefore, this change introduces a better algorithm intended to
always find the best leader distribution.

The formula is:
  expected leader num =
      (tablets sum) % (tablet servers num) == 0
          ? (tablets sum) / (tablet servers num)
          : ceil((tablets sum) / (tablet servers num))
A tserver whose leader count is greater than the expected value needs
to transfer leaderships.

So the leader skew will never be more than 1.
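
As a rough illustration of the formula above (not the actual Kudu
code, which is C++), the following Python sketch computes the expected
leader count and lists the tservers that would need to give up
leaderships. The function names and the dictionary layout are
hypothetical and chosen only for this example.

```python
def expected_leader_count(total_tablets: int, num_tservers: int) -> int:
    """Expected leaders per tserver, following the formula in the commit message.

    Equal to total / n when it divides evenly, otherwise ceil(total / n).
    """
    if total_tablets % num_tservers == 0:
        return total_tablets // num_tservers
    # Integer ceiling division, avoiding floating point.
    return total_tablets // num_tservers + 1


def overloaded_tservers(leaders: dict) -> dict:
    """Map each tserver above the expected count to its number of excess leaders.

    `leaders` maps a (hypothetical) tserver name to its current leader count
    for one table.
    """
    expected = expected_leader_count(sum(leaders.values()), len(leaders))
    return {ts: n - expected for ts, n in leaders.items() if n > expected}


# The example from the commit message: 9 tablets on 3 tservers.
leaders = {"A": 4, "B": 4, "C": 1}
print(expected_leader_count(9, 3))   # 3
print(overloaded_tservers(leaders))  # {'A': 1, 'B': 1}
```

In this example A and B each hold one leader more than the expected
value of 3, so each would transfer one leadership, which yields the
3/3/3 distribution.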

Change-Id: I0f1fe796fd98da2d8764da793b7e254319e6348a
Reviewed-on: http://gerrit.cloudera.org:8080/20310
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <[email protected]>


> Leader rebalance not working well
> ---------------------------------
>
>                 Key: KUDU-3497
>                 URL: https://issues.apache.org/jira/browse/KUDU-3497
>             Project: Kudu
>          Issue Type: Bug
>            Reporter: Song Jiacheng
>            Priority: Major
>         Attachments: KUDU-3497.patch, image-2023-07-28-18-26-59-763.png
>
>
> The leader rebalancer is a thread that tries to balance the leaders of each 
> table. But there are some situations where it does not work.
> For instance, consider a table consisting of 9 tablets with a replication 
> factor of 3 in a 3-tserver cluster.
> Its leader distribution is as follows:
> Tablet server A : 4
> Tablet server B : 4
> Tablet server C : 1
> !image-2023-07-28-18-26-59-763.png!
> With the current algorithm, no rebalance operation is scheduled in this 
> case.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
