[ 
https://issues.apache.org/jira/browse/IGNITE-20640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Gusakov reassigned IGNITE-20640:
---------------------------------------

    Assignee: Kirill Gusakov

> RAFT client does not change peers after rebalance
> -------------------------------------------------
>
>                 Key: IGNITE-20640
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20640
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Vladislav Pyatkov
>            Assignee: Kirill Gusakov
>            Priority: Blocker
>
> *Motivation*
> Because nodes start simultaneously, tests may trigger several rebalances at 
> startup. After a rebalance finishes, the list of peers for the raft 
> replication group can be different. The changed list of peers should be 
> propagated to RAFT clients, but this does not happen.
> The method (InternalTableImpl#updateInternalTableRaftGroupService) updates 
> clients only on table start and does not account for later rebalances. 
> Currently, we try to send a raft command but receive a timeout exception 
> because the leader is absent from the client's list of peers:
> {noformat}
> [2023-10-13T15:16:18,120][INFO ][%node1%tableManager-io-13][Loza] Start new raft node=RaftNodeId [groupId=3_part_12, peer=Peer [consistentId=node1, idx=0]] with initial configuration=PeersAndLearners [peers=Set12 [Peer [consistentId=node1, idx=0]], learners=SetN []]
> [2023-10-13T15:16:18,472][INFO ][%node2%tableManager-io-14][Loza] Start new raft node=RaftNodeId [groupId=3_part_12, peer=Peer [consistentId=node2, idx=0]] with initial configuration=PeersAndLearners [peers=Set12 [Peer [consistentId=node1, idx=0]], learners=SetN []]
> ...
> [2023-10-13T15:16:18,661][ERROR][%node1%JRaft-Request-Processor-21][RpcRequestProcessor] handleRequest ChangePeersAsyncRequestImpl [groupId=3_part_12, leaderId=node1, newLearnersList=ArrayList [], newPeersList=ArrayList [node2], term=2] failed
> java.lang.IllegalStateException: Not leader
>         at org.apache.ignite.raft.jraft.core.NodeImpl.listPeers(NodeImpl.java:3293) ~[main/:?]
>         at org.apache.ignite.raft.jraft.rpc.impl.cli.ChangePeersAsyncRequestProcessor.processRequest0(ChangePeersAsyncRequestProcessor.java:55) ~[main/:?]
>         at org.apache.ignite.raft.jraft.rpc.impl.cli.ChangePeersAsyncRequestProcessor.processRequest0(ChangePeersAsyncRequestProcessor.java:36) ~[main/:?]
>         at org.apache.ignite.raft.jraft.rpc.impl.cli.BaseCliRequestProcessor.processRequest(BaseCliRequestProcessor.java:112) ~[main/:?]
>         at org.apache.ignite.raft.jraft.rpc.RpcRequestProcessor.handleRequest(RpcRequestProcessor.java:49) ~[main/:?]
>         at org.apache.ignite.raft.jraft.rpc.RpcRequestProcessor.handleRequest(RpcRequestProcessor.java:29) ~[main/:?]
>         at org.apache.ignite.raft.jraft.rpc.impl.IgniteRpcServer$RpcMessageHandler.lambda$onReceived$0(IgniteRpcServer.java:194) ~[main/:?]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
>         at java.lang.Thread.run(Thread.java:834) [?:?]
> {noformat}
> *Definition of done*
> After a rebalance, the peer list used by the RAFT client should be updated so 
> that RAFT commands are applied.
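
The gap described in the motivation can be illustrated with a small, self-contained sketch. It is not Ignite code: PeerRefreshSketch, RaftClientSketch, onRebalanceFinished and canReach are hypothetical names introduced only for illustration, and the real fix may look quite different. The sketch assumes the client is notified when a rebalance finishes and then replaces its peer set; the node names mirror the log above, where the group starts with node1 and is changed to node2.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical illustration only; none of these types exist in Ignite. */
public class PeerRefreshSketch {
    /** Minimal stand-in for a RAFT client that knows only a set of peer consistent IDs. */
    static final class RaftClientSketch {
        private final AtomicReference<Set<String>> peers;

        RaftClientSketch(Set<String> initialPeers) {
            // Peer set captured at table start.
            this.peers = new AtomicReference<>(Set.copyOf(initialPeers));
        }

        /** Assumed hook: invoked after a rebalance finishes with the new peer set. */
        void onRebalanceFinished(Set<String> newPeers) {
            peers.set(Set.copyOf(newPeers));
        }

        /** A command can be routed only if the leader is among the known peers. */
        boolean canReach(String leaderId) {
            return peers.get().contains(leaderId);
        }
    }

    public static void main(String[] args) {
        RaftClientSketch client = new RaftClientSketch(Set.of("node1"));

        // Without a refresh, the client cannot find a leader outside its stale peer set.
        System.out.println(client.canReach("node2")); // false

        // Once the client is told about the rebalance result, the leader is reachable.
        client.onRebalanceFinished(Set.of("node2"));
        System.out.println(client.canReach("node2")); // true
    }
}
```

The peer set is swapped atomically so that a command sent concurrently sees either the old or the new list, never a partially updated one.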



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
