[ https://issues.apache.org/jira/browse/IGNITE-17064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mirza Aliev updated IGNITE-17064:
---------------------------------
    Description: 
ItRaftGroupServiceTest#testTransferLeadership is flaky with 
{{java.util.concurrent.ExecutionException: class 
org.apache.ignite.raft.jraft.rpc.impl.RaftException: EBUSY:Changing the 
configuration}}

There is a comment in {{NodeImpl#transferLeadershipTo}} about the situation 
when EBUSY:Changing the configuration is thrown:

 
{noformat}
            if (this.confCtx.isBusy()) {
                // It's very messy to deal with the case when the |peer| received
                // TimeoutNowRequest and increase the term while somehow another leader
                // which was not replicated with the newest configuration has been
                // elected. If no add_peer with this very |peer| is to be invoked ever
                // after nor this peer is to be killed, this peer will spin in the voting
                // procedure and make the each new leader stepped down when the peer
                // reached vote timeout and it starts to vote (because it will increase
                // the term of the group)
                // To make things simple, refuse the operation and force users to
                // invoke transfer_leadership_to after configuration changing is
                // completed so that the peer's configuration is up-to-date when it
                // receives the TimeOutNowRequest.
                LOG.warn(
                    "Node {} refused to transfer leadership to peer {} when the leader is changing the configuration.",
                    getNodeId(), peer);
                return new Status(RaftError.EBUSY, "Changing the configuration");
            }
{noformat}

The current limitation must be investigated.
It seems the easiest way to fix the test is to rewrite it so that it retries the 
transfer leadership invocation.
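
A minimal sketch of what repeating the invocation could look like, assuming the transfer call returns a {{CompletableFuture}} and that the busy condition can be recognized from the failure cause. {{TransferLeadershipRetry}}, {{retryWhileBusy}} and {{BusyException}} are hypothetical names used only for illustration; {{BusyException}} stands in for a {{RaftException}} carrying {{EBUSY}} and is not the real Ignite class.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Supplier;

/** Hypothetical helper: repeats an asynchronous call while it fails with a "busy" error. */
final class TransferLeadershipRetry {
    /** Stand-in for a RaftException with EBUSY status (assumption, not the Ignite class). */
    static final class BusyException extends RuntimeException {
        BusyException(String message) {
            super(message);
        }
    }

    /**
     * Invokes the operation up to maxAttempts times, sleeping delayMillis between attempts.
     * Only failures caused by BusyException are retried; any other failure, or a busy
     * failure on the last attempt, is rethrown to the caller.
     */
    static <T> T retryWhileBusy(Supplier<CompletableFuture<T>> operation, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                // join() rethrows a failure as an unchecked CompletionException.
                return operation.get().join();
            } catch (CompletionException e) {
                boolean busy = e.getCause() instanceof BusyException;
                if (!busy || attempt >= maxAttempts) {
                    throw e;
                }
            }
            try {
                // Give the pending configuration change time to complete.
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting to retry", e);
            }
        }
    }
}
```

In the test, the supplier would wrap the transfer leadership call, and the busy check would need to inspect the actual error code instead of an exception type. A bounded number of attempts keeps the test from hanging if the configuration change never completes.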



 

  was:
ItRaftGroupServiceTest#testTransferLeadership is flaky.

There is a comment in {{NodeImpl#transferLeadershipTo}} about the situation 
when EBUSY:Changing the configuration is thrown:

 
{noformat}
            if (this.confCtx.isBusy()) {
                // It's very messy to deal with the case when the |peer| received
                // TimeoutNowRequest and increase the term while somehow another leader
                // which was not replicated with the newest configuration has been
                // elected. If no add_peer with this very |peer| is to be invoked ever
                // after nor this peer is to be killed, this peer will spin in the voting
                // procedure and make the each new leader stepped down when the peer
                // reached vote timeout and it starts to vote (because it will increase
                // the term of the group)
                // To make things simple, refuse the operation and force users to
                // invoke transfer_leadership_to after configuration changing is
                // completed so that the peer's configuration is up-to-date when it
                // receives the TimeOutNowRequest.
                LOG.warn(
                    "Node {} refused to transfer leadership to peer {} when the leader is changing the configuration.",
                    getNodeId(), peer);
                return new Status(RaftError.EBUSY, "Changing the configuration");
            }
{noformat}

The current limitation must be investigated.
Seems like the easiest way to fix test is to rewrite it and repeat transfer 
leadership invocation.



 


> ItRaftGroupServiceTest#testTransferLeadership is flaky
> ------------------------------------------------------
>
>                 Key: IGNITE-17064
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17064
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Mirza Aliev
>            Priority: Blocker
>              Labels: ignite-3
>
> ItRaftGroupServiceTest#testTransferLeadership is flaky with 
> {{java.util.concurrent.ExecutionException: class 
> org.apache.ignite.raft.jraft.rpc.impl.RaftException: EBUSY:Changing the 
> configuration}}
> There is a comment in {{NodeImpl#transferLeadershipTo}} about the situation 
> when EBUSY:Changing the configuration is thrown:
>  
> {noformat}
>             if (this.confCtx.isBusy()) {
>                 // It's very messy to deal with the case when the |peer| received
>                 // TimeoutNowRequest and increase the term while somehow another leader
>                 // which was not replicated with the newest configuration has been
>                 // elected. If no add_peer with this very |peer| is to be invoked ever
>                 // after nor this peer is to be killed, this peer will spin in the voting
>                 // procedure and make the each new leader stepped down when the peer
>                 // reached vote timeout and it starts to vote (because it will increase
>                 // the term of the group)
>                 // To make things simple, refuse the operation and force users to
>                 // invoke transfer_leadership_to after configuration changing is
>                 // completed so that the peer's configuration is up-to-date when it
>                 // receives the TimeOutNowRequest.
>                 LOG.warn(
>                     "Node {} refused to transfer leadership to peer {} when the leader is changing the configuration.",
>                     getNodeId(), peer);
>                 return new Status(RaftError.EBUSY, "Changing the configuration");
>             }
> {noformat}
> The current limitation must be investigated.
> It seems the easiest way to fix the test is to rewrite it so that it retries the 
> transfer leadership invocation.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
