[
https://issues.apache.org/jira/browse/IGNITE-20640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kirill Gusakov reassigned IGNITE-20640:
---------------------------------------
Assignee: Kirill Gusakov
> RAFT client does not change peers after rebalance
> -------------------------------------------------
>
> Key: IGNITE-20640
> URL: https://issues.apache.org/jira/browse/IGNITE-20640
> Project: Ignite
> Issue Type: Bug
> Reporter: Vladislav Pyatkov
> Assignee: Kirill Gusakov
> Priority: Blocker
>
> *Motivation*
> Because nodes start simultaneously, tests may trigger several rebalances at
> startup. After a rebalance finishes, the list of peers for the raft
> replication group can differ from the initial one. The changed list of peers
> should be applied to RAFT clients, but this does not happen.
> The method (InternalTableImpl#updateInternalTableRaftGroupService) updates
> clients only on table start and does not account for later rebalances.
> Currently, we try to send a raft command but receive a timeout exception
> because the leader is absent from the client's list of peers:
> {noformat}
> [2023-10-13T15:16:18,120][INFO ][%node1%tableManager-io-13][Loza] Start new
> raft node=RaftNodeId [groupId=3_part_12, peer=Peer [consistentId=node1,
> idx=0]] with initial configuration=PeersAndLearners [peers=Set12 [Peer
> [consistentId=node1, idx=0]], learners=SetN []]
> [2023-10-13T15:16:18,472][INFO ][%node2%tableManager-io-14][Loza] Start new
> raft node=RaftNodeId [groupId=3_part_12, peer=Peer [consistentId=node2,
> idx=0]] with initial configuration=PeersAndLearners [peers=Set12 [Peer
> [consistentId=node1, idx=0]], learners=SetN []]
> ...
> [2023-10-13T15:16:18,661][ERROR][%node1%JRaft-Request-Processor-21][RpcRequestProcessor]
> handleRequest ChangePeersAsyncRequestImpl [groupId=3_part_12,
> leaderId=node1, newLearnersList=ArrayList [], newPeersList=ArrayList [node2],
> term=2] failed
> java.lang.IllegalStateException: Not leader
> at org.apache.ignite.raft.jraft.core.NodeImpl.listPeers(NodeImpl.java:3293) ~[main/:?]
> at org.apache.ignite.raft.jraft.rpc.impl.cli.ChangePeersAsyncRequestProcessor.processRequest0(ChangePeersAsyncRequestProcessor.java:55) ~[main/:?]
> at org.apache.ignite.raft.jraft.rpc.impl.cli.ChangePeersAsyncRequestProcessor.processRequest0(ChangePeersAsyncRequestProcessor.java:36) ~[main/:?]
> at org.apache.ignite.raft.jraft.rpc.impl.cli.BaseCliRequestProcessor.processRequest(BaseCliRequestProcessor.java:112) ~[main/:?]
> at org.apache.ignite.raft.jraft.rpc.RpcRequestProcessor.handleRequest(RpcRequestProcessor.java:49) ~[main/:?]
> at org.apache.ignite.raft.jraft.rpc.RpcRequestProcessor.handleRequest(RpcRequestProcessor.java:29) ~[main/:?]
> at org.apache.ignite.raft.jraft.rpc.impl.IgniteRpcServer$RpcMessageHandler.lambda$onReceived$0(IgniteRpcServer.java:194) ~[main/:?]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> {noformat}
> *Definition of done*
> After a rebalance, the peer list used by RAFT clients should be updated, and
> RAFT commands should be applied successfully.
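
The gap described in the motivation can be illustrated with a minimal Java sketch. All names here (SketchRaftClient, SketchRebalanceEvents, updatePeers, onRebalanceFinished) are hypothetical and are not Ignite APIs. The sketch only assumes that some component can notify the client when a rebalance finishes, and it shows the client replacing its peer list on that notification instead of only at table start.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical stand-in for a RAFT client; NOT the Ignite RaftGroupService API.
final class SketchRaftClient {
    // Peers the client believes belong to the replication group.
    private volatile Set<String> peers;

    SketchRaftClient(Set<String> initialPeers) {
        this.peers = Set.copyOf(initialPeers);
    }

    // Replaces the peer list; intended to be called after every rebalance.
    void updatePeers(Set<String> newPeers) {
        this.peers = Set.copyOf(newPeers);
    }

    // In this sketch, a command can reach the leader only if the leader
    // is present in the client's current peer list.
    boolean canReach(String leaderConsistentId) {
        return peers.contains(leaderConsistentId);
    }
}

// Hypothetical source of "rebalance finished" notifications.
final class SketchRebalanceEvents {
    private final List<Consumer<Set<String>>> listeners = new CopyOnWriteArrayList<>();

    void onRebalanceFinished(Consumer<Set<String>> listener) {
        listeners.add(listener);
    }

    void fireRebalanceFinished(Set<String> newPeers) {
        for (Consumer<Set<String>> listener : listeners) {
            listener.accept(newPeers);
        }
    }
}
```

With the listener registered via `events.onRebalanceFinished(client::updatePeers)`, a client created with peers {node1} starts to reach node2 once a rebalance to {node2} is reported, which mirrors the peer change shown in the log above. Without the listener, the client keeps the stale list, which is the behavior this issue reports.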