[ 
https://issues.apache.org/jira/browse/IGNITE-17064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-17064:
---------------------------------
    Ignite Flags:   (was: Docs Required,Release Notes Required)

> ItRaftGroupServiceTest#testTransferLeadership is flaky
> ------------------------------------------------------
>
>                 Key: IGNITE-17064
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17064
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Mirza Aliev
>            Priority: Blocker
>              Labels: ignite-3
>
> ItRaftGroupServiceTest#testTransferLeadership is flaky and intermittently fails with:
> {noformat}
> java.util.concurrent.ExecutionException: class org.apache.ignite.raft.jraft.rpc.impl.RaftException: EBUSY:Changing the configuration
> {noformat}
> A comment in {{NodeImpl#transferLeadershipTo}} explains the situation in 
> which the {{EBUSY:Changing the configuration}} status is returned:
>  
> {noformat}
>             if (this.confCtx.isBusy()) {
>                 // It's very messy to deal with the case when the |peer| received
>                 // TimeoutNowRequest and increase the term while somehow another leader
>                 // which was not replicated with the newest configuration has been
>                 // elected. If no add_peer with this very |peer| is to be invoked ever
>                 // after nor this peer is to be killed, this peer will spin in the voting
>                 // procedure and make the each new leader stepped down when the peer
>                 // reached vote timeout and it starts to vote (because it will increase
>                 // the term of the group)
>                 // To make things simple, refuse the operation and force users to
>                 // invoke transfer_leadership_to after configuration changing is
>                 // completed so that the peer's configuration is up-to-date when it
>                 // receives the TimeOutNowRequest.
>                 LOG.warn(
>                     "Node {} refused to transfer leadership to peer {} when the leader is changing the configuration.",
>                     getNodeId(), peer);
>                 return new Status(RaftError.EBUSY, "Changing the configuration");
>             }
> {noformat}
> This limitation needs to be investigated.
> The easiest way to fix the test seems to be to rewrite it so that it retries 
> the transfer leadership invocation while the leader reports {{EBUSY}}.
>  
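The retry approach suggested above could look roughly like the sketch below. It is only an illustration and not code from the Ignite code base: the class name, the BusyException type (a stand-in for a RaftException carrying RaftError.EBUSY) and the retryWhileBusy helper are all hypothetical, and the operation is modelled as a plain Supplier of CompletableFuture rather than the real raft group service call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.function.Supplier;

public class TransferLeadershipRetry {
    /** Hypothetical stand-in for a RaftException that carries RaftError.EBUSY. */
    static class BusyException extends RuntimeException {
        BusyException(String message) {
            super(message);
        }
    }

    /**
     * Invokes the operation until it succeeds or maxAttempts is reached.
     * Only failures caused by BusyException are retried; anything else is rethrown.
     * Returns the number of attempts that were needed.
     */
    static int retryWhileBusy(Supplier<CompletableFuture<Void>> operation, int maxAttempts, long delayMillis)
            throws InterruptedException, ExecutionException {
        for (int attempt = 1; ; attempt++) {
            try {
                // Start the asynchronous operation and wait for its result.
                operation.get().get();
                return attempt;
            } catch (ExecutionException e) {
                boolean busy = e.getCause() instanceof BusyException;
                if (!busy || attempt >= maxAttempts) {
                    throw e;
                }
                // Give the configuration change some time to complete.
                Thread.sleep(delayMillis);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated transfer: refused twice while "changing the configuration", then accepted.
        Supplier<CompletableFuture<Void>> transfer = () -> {
            if (++calls[0] < 3) {
                return CompletableFuture.failedFuture(new BusyException("EBUSY:Changing the configuration"));
            }
            return CompletableFuture.completedFuture(null);
        };
        int attempts = retryWhileBusy(transfer, 5, 10);
        System.out.println("Succeeded after " + attempts + " attempts");
    }
}
```

In the test itself, the supplier would wrap the actual transfer leadership call, and the busy check would presumably inspect the error code of the real exception instead of a dedicated exception type. Bounding the number of attempts keeps the test from hanging if the configuration change never completes.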



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
