[ 
https://issues.apache.org/jira/browse/IGNITE-20303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Uttsel updated IGNITE-20303:
-----------------------------------
    Description: 
If many assignment changes happen in quick succession, the rebalance does not have 
time to complete for each change. In this case, the following exception is thrown:

{code:java}
2023-08-24T16:58:51,328][ERROR][%irdt_ttqr_20000%tableManager-io-10][WatchProcessor]
 Error occurred when processing a watch event
 org.apache.ignite.lang.IgniteInternalException: Raft group on the node is 
already started [nodeId=RaftNodeId [groupId=1_part_0, peer=Peer 
[consistentId=irdt_ttqr_20000, idx=0]]]
        at 
org.apache.ignite.internal.raft.Loza.startRaftGroupNodeInternal(Loza.java:342) 
~[main/:?]
        at 
org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:230) 
~[main/:?]
        at 
org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:203) 
~[main/:?]
        at 
org.apache.ignite.internal.table.distributed.TableManager.startPartitionRaftGroupNode(TableManager.java:2361)
 ~[main/:?]
        at 
org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$98(TableManager.java:2261)
 ~[main/:?]
        at 
org.apache.ignite.internal.util.IgniteUtils.inBusyLock(IgniteUtils.java:922) 
~[main/:?]
        at 
org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$99(TableManager.java:2259)
 ~[main/:?]
        at 
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736)
 ~[?:?]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
~[?:?]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
~[?:?]
        at java.lang.Thread.run(Thread.java:834) ~[?:?]
{code}

The reproducer is based on ItRebalanceDistributedTest#testThreeQueuedRebalances. 
The exception can be seen in the test log:

{code:java}
    @Test
    void testThreeQueuedRebalances() throws Exception {
        Node node = getNode(0);

        createZone(node, ZONE_NAME, 1, 1);

        createTable(node, ZONE_NAME, TABLE_NAME);

        assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, 
0).size() == 1, AWAIT_TIMEOUT_MILLIS));

        alterZone(node, ZONE_NAME, 2);
        alterZone(node, ZONE_NAME, 3);
        alterZone(node, ZONE_NAME, 2);
        alterZone(node, ZONE_NAME, 3);
        alterZone(node, ZONE_NAME, 2);
        alterZone(node, ZONE_NAME, 3);
        alterZone(node, ZONE_NAME, 2);
        alterZone(node, ZONE_NAME, 3);
        alterZone(node, ZONE_NAME, 2);
        alterZone(node, ZONE_NAME, 3);
        alterZone(node, ZONE_NAME, 2);

        waitPartitionAssignmentsSyncedToExpected(0, 2);

        checkPartitionNodes(0, 2);
    }
{code}

We can fix this by checking whether the Raft node and the Replica have already 
been created before calling startPartitionRaftGroupNode and 
startReplicaWithNewListener in TableManager#handleChangePendingAssignmentEvent.
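A minimal, self-contained sketch of the proposed guard, assuming an atomic "started" registry keyed by partition group id (the class and method names here are illustrative, not actual Ignite API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Sketch of an idempotent start check: before starting a Raft group node
 * (or a Replica) for a pending assignment event, verify that one is not
 * already running for that partition, so that queued assignment changes
 * cannot trigger the "Raft group on the node is already started" exception.
 */
public class GroupStarter {
    private final Map<String, Boolean> startedGroups = new ConcurrentHashMap<>();

    /**
     * Returns true if this call claimed the start, false if the group
     * was already started. putIfAbsent is atomic, so two concurrent
     * assignment events cannot both start the same group.
     */
    public boolean startIfAbsent(String groupId) {
        return startedGroups.putIfAbsent(groupId, Boolean.TRUE) == null;
    }

    public static void main(String[] args) {
        GroupStarter starter = new GroupStarter();
        // First pending-assignment event starts the node; the second skips it.
        System.out.println(starter.startIfAbsent("1_part_0")); // true
        System.out.println(starter.startIfAbsent("1_part_0")); // false
    }
}
```

The real fix would consult the actual Raft/Replica managers rather than a local map, but the atomic check-then-start shape is the same.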

> "Raft group on the node is already started" exception when pending and 
> planned assignment changed faster than rebalance
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-20303
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20303
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Sergey Uttsel
>            Priority: Major
>              Labels: ignite-3
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
