[ 
https://issues.apache.org/jira/browse/RATIS-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaijie Chen updated RATIS-1769:
-------------------------------
    Description: 
This is a followup of RATIS-1762. -TransferCommand should not change priority 
of peers (or at least not by default).-

-Sadly this will break backward compatibility. But version 3.0 hasn't been 
released, so it might be OK.-

-Add a new TransferLeadershipCommand which will not change priority of peers 
when transferring leadership.-
-The old TransferCommand is deprecated and kept as is for backward 
compatibility reasons.-

Try to avoid changing priorities before transferring leadership in TransferCommand.
It will fall back to "transfer leadership by changing priority" for backward 
compatibility.
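
The fallback described above can be sketched roughly as follows. This is only an illustration under stated assumptions: {{Cluster}}, {{FakeCluster}} and the {{transfer}} helper are hypothetical names and not Ratis APIs, and the real command may step priorities or repeat attempts differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of "transfer first, change priority only if needed". */
public class TransferFallbackSketch {

  /** Stand-in for the operations the command needs; not a Ratis interface. */
  interface Cluster {
    /** Returns true if the target peer accepted leadership. */
    boolean transferLeadership(String peerId);

    /** Sets the priority of the given peer. */
    void setPriority(String peerId, int priority);
  }

  /**
   * Tries a plain transfer first. Only if that fails does it raise the
   * target's priority and try once more (the backward-compatible path).
   */
  static boolean transfer(Cluster cluster, String peerId, int raisedPriority) {
    if (cluster.transferLeadership(peerId)) {
      return true; // no priority change was needed
    }
    cluster.setPriority(peerId, raisedPriority);
    return cluster.transferLeadership(peerId);
  }

  /** Fake cluster that records calls; a "legacy" one needs a priority change first. */
  static final class FakeCluster implements Cluster {
    final List<String> calls = new ArrayList<>();
    private final boolean legacy;
    private boolean priorityRaised;

    FakeCluster(boolean legacy) {
      this.legacy = legacy;
    }

    @Override
    public boolean transferLeadership(String peerId) {
      calls.add("transfer " + peerId);
      return !legacy || priorityRaised;
    }

    @Override
    public void setPriority(String peerId, int priority) {
      calls.add("priority " + peerId + "=" + priority);
      priorityRaised = true;
    }
  }

  public static void main(String[] args) {
    FakeCluster modern = new FakeCluster(false);
    System.out.println(transfer(modern, "n1", 4) + " " + modern.calls);

    FakeCluster legacy = new FakeCluster(true);
    System.out.println(transfer(legacy, "n1", 4) + " " + legacy.calls);
  }
}
```

With the fake "modern" cluster only a single transfer call is recorded, while the fake "legacy" cluster records a transfer, a priority change, and a second transfer.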
h3. Example
{code:java}
$ bin/ratis sh election transfer -peers 127.0.0.1:10024,127.0.0.1:10124,127.0.0.1:11124 -address 127.0.0.1:10124
[main] INFO org.reflections.Reflections - Reflections took 122 ms to scan 1 urls, producing 5 keys and 18 values
[main] WARN org.apache.ratis.metrics.MetricRegistries - Found multiple MetricRegistries implementations: class org.apache.ratis.metrics.impl.MetricRegistriesImpl, class org.apache.ratis.metrics.dropwizard3.Dm3MetricRegistriesImpl. Using first found implementation: org.apache.ratis.metrics.impl.MetricRegistriesImpl@1e67a849
Transferring leadership to peer n1 with address <127.0.0.1:10124>
Transferring leadership initiated{code}
h3. Backward compatibility

Ratis shell version: {{{}3.0.0-SNAPSHOT{}}}.
Ratis server version: {{{}2.4.1{}}}.
{code:java}
$ bin/ratis sh election transfer -peers 127.0.0.1:10024,127.0.0.1:10124,127.0.0.1:11124 -address 127.0.0.1:10124
[main] INFO org.reflections.Reflections - Reflections took 135 ms to scan 1 urls, producing 5 keys and 18 values
[main] WARN org.apache.ratis.metrics.MetricRegistries - Found multiple MetricRegistries implementations: class org.apache.ratis.metrics.impl.MetricRegistriesImpl, class org.apache.ratis.metrics.dropwizard3.Dm3MetricRegistriesImpl. Using first found implementation: org.apache.ratis.metrics.impl.MetricRegistriesImpl@1e67a849
Transferring leadership to peer n1 with address <127.0.0.1:10124>
Changing priority of peer n1 with address <127.0.0.1:10124> to 4
Transferring leadership to peer n1 with address <127.0.0.1:10124>
Changing priority of peer n1 with address <127.0.0.1:10124> to 5
Transferring leadership initiated{code}
h3. In case of failure

In most cases, a simple retry will fix the problem. Users can also set the 
timeout manually with the {{-timeout}} option.
{code:java}
$ bin/ratis sh election transfer -peers 127.0.0.1:10024,127.0.0.1:10124,127.0.0.1:11124 -address 127.0.0.1:10124
[main] INFO org.reflections.Reflections - Reflections took 135 ms to scan 1 urls, producing 5 keys and 18 values
[main] WARN org.apache.ratis.metrics.MetricRegistries - Found multiple MetricRegistries implementations: class org.apache.ratis.metrics.impl.MetricRegistriesImpl, class org.apache.ratis.metrics.dropwizard3.Dm3MetricRegistriesImpl. Using first found implementation: org.apache.ratis.metrics.impl.MetricRegistriesImpl@1e67a849
Transferring leadership to peer n1 with address <127.0.0.1:10124>
Failed to transfer peer n1 with address <127.0.0.1:10124>: org.apache.ratis.protocol.exceptions.TransferLeadershipException: n2@group-ABB3109A44C1: Failed to transfer leadership to n1 (timed out 3000ms): current leader is n2
        at org.apache.ratis.server.impl.TransferLeadership$PendingRequest.complete(TransferLeadership.java:67)
        at org.apache.ratis.server.impl.TransferLeadership.lambda$finish$7(TransferLeadership.java:163)
        at java.util.Optional.ifPresent(Optional.java:159)
        at org.apache.ratis.server.impl.TransferLeadership.finish(TransferLeadership.java:163)
        at org.apache.ratis.server.impl.TransferLeadership.lambda$start$4(TransferLeadership.java:136)
        at org.apache.ratis.util.TimeoutTimer.lambda$onTimeout$2(TimeoutTimer.java:101)
        at org.apache.ratis.util.LogUtils.runAndLog(LogUtils.java:38)
        at org.apache.ratis.util.LogUtils$1.run(LogUtils.java:79)
        at org.apache.ratis.util.TimeoutTimer$Task.run(TimeoutTimer.java:55)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505){code}
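
For callers that want to automate the "just retry" advice above, a bounded retry loop might look like the following. This is a minimal sketch, not part of Ratis: {{TransferAttempt}} and {{retry}} are hypothetical names, and the attempt is assumed to signal a timeout by throwing {{java.util.concurrent.TimeoutException}}.

```java
import java.time.Duration;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch of retrying a leadership transfer that timed out. */
public class RetryTransferSketch {

  /** One transfer attempt with a caller-chosen timeout; not a Ratis interface. */
  interface TransferAttempt {
    void run(Duration timeout) throws TimeoutException;
  }

  /**
   * Runs the attempt up to maxAttempts times with the same timeout.
   * Returns the 1-based number of the attempt that succeeded, or rethrows
   * the last TimeoutException if every attempt timed out.
   */
  static int retry(TransferAttempt attempt, Duration timeout, int maxAttempts)
      throws TimeoutException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    TimeoutException last = null;
    for (int i = 1; i <= maxAttempts; i++) {
      try {
        attempt.run(timeout);
        return i;
      } catch (TimeoutException e) {
        last = e; // remember the failure and try again
      }
    }
    throw last;
  }

  public static void main(String[] args) throws TimeoutException {
    int[] calls = {0};
    // Fake attempt: the first call times out, the second one succeeds.
    int used = retry(timeout -> {
      if (++calls[0] == 1) {
        throw new TimeoutException("timed out " + timeout.toMillis() + "ms");
      }
    }, Duration.ofSeconds(3), 3);
    System.out.println("succeeded on attempt " + used);
  }
}
```

The loop keeps the timeout fixed across attempts; a caller could equally pass a larger timeout on each retry.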

  was:
This is a followup of RATIS-1762. -TransferCommand should not change priority 
of peers (or at least not by default).-

-Sadly this will break backward compatibility. But version 3.0 hasn't been 
released, so it might be OK.-

-Add a new TransferLeadershipCommand which will not change priority of peers 
when transferring leadership.-
-The old TransferCommand is deprecated and kept as is for backward 
compatibility reasons.-

Try to avoid changing priorities before transferring leadership in TransferCommand.
It will fall back to "transfer leadership by changing priority" for backward 
compatibility.
h3. Example
{code:java}
$ bin/ratis sh election transfer -peers 127.0.0.1:10024,127.0.0.1:10124,127.0.0.1:11124 -address 127.0.0.1:10124
[main] INFO org.reflections.Reflections - Reflections took 157 ms to scan 1 urls, producing 5 keys and 18 values
[main] WARN org.apache.ratis.metrics.MetricRegistries - Found multiple MetricRegistries implementations: class org.apache.ratis.metrics.impl.MetricRegistriesImpl, class org.apache.ratis.metrics.dropwizard3.Dm3MetricRegistriesImpl. Using first found implementation: org.apache.ratis.metrics.impl.MetricRegistriesImpl@1e67a849
Transferring leadership to server with address <127.0.0.1:10124>
Transferring leadership initiated{code}
h3. Backward compatibility

Ratis shell version: {{{}3.0.0-SNAPSHOT{}}}.
Ratis server version: {{{}2.4.1{}}}.
{code:java}
$ bin/ratis sh election transfer -peers 127.0.0.1:10024,127.0.0.1:10124,127.0.0.1:11124 -address 127.0.0.1:10124
[main] INFO org.reflections.Reflections - Reflections took 131 ms to scan 1 urls, producing 5 keys and 18 values
[main] WARN org.apache.ratis.metrics.MetricRegistries - Found multiple MetricRegistries implementations: class org.apache.ratis.metrics.impl.MetricRegistriesImpl, class org.apache.ratis.metrics.dropwizard3.Dm3MetricRegistriesImpl. Using first found implementation: org.apache.ratis.metrics.impl.MetricRegistriesImpl@1e67a849
Transferring leadership to server with address <127.0.0.1:10124>
Changing priority of <127.0.0.1:10124> to 2:
Transferring leadership to server with address <127.0.0.1:10124>
Changing priority of <127.0.0.1:10124> to 3:
Transferring leadership initiated{code}
h3. In case of failure

In most cases, a simple retry will fix the problem. Users can also set the 
timeout manually with the {{-timeout}} option.
{code:java}
$ bin/ratis sh election transfer -peers 127.0.0.1:10024,127.0.0.1:10124,127.0.0.1:11124 -address 127.0.0.1:10124
[main] INFO org.reflections.Reflections - Reflections took 135 ms to scan 1 urls, producing 5 keys and 18 values
[main] WARN org.apache.ratis.metrics.MetricRegistries - Found multiple MetricRegistries implementations: class org.apache.ratis.metrics.impl.MetricRegistriesImpl, class org.apache.ratis.metrics.dropwizard3.Dm3MetricRegistriesImpl. Using first found implementation: org.apache.ratis.metrics.impl.MetricRegistriesImpl@1e67a849
Transferring leadership to server with address <127.0.0.1:10124>
Failed to transfer peer n1 with address 127.0.0.1:10124: org.apache.ratis.protocol.exceptions.TransferLeadershipException: n2@group-ABB3109A44C1: Failed to transfer leadership to n1 (timed out 3000ms): current leader is n2
        at org.apache.ratis.server.impl.TransferLeadership$PendingRequest.complete(TransferLeadership.java:67)
        at org.apache.ratis.server.impl.TransferLeadership.lambda$finish$7(TransferLeadership.java:163)
        at java.util.Optional.ifPresent(Optional.java:159)
        at org.apache.ratis.server.impl.TransferLeadership.finish(TransferLeadership.java:163)
        at org.apache.ratis.server.impl.TransferLeadership.lambda$start$4(TransferLeadership.java:136)
        at org.apache.ratis.util.TimeoutTimer.lambda$onTimeout$2(TimeoutTimer.java:101)
        at org.apache.ratis.util.LogUtils.runAndLog(LogUtils.java:38)
        at org.apache.ratis.util.LogUtils$1.run(LogUtils.java:79)
        at org.apache.ratis.util.TimeoutTimer$Task.run(TimeoutTimer.java:55)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505){code}


> Avoid changing priorities in TransferCommand unless necessary
> -------------------------------------------------------------
>
>                 Key: RATIS-1769
>                 URL: https://issues.apache.org/jira/browse/RATIS-1769
>             Project: Ratis
>          Issue Type: Sub-task
>            Reporter: Kaijie Chen
>            Priority: Major
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
