[
https://issues.apache.org/jira/browse/IGNITE-20640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladislav Pyatkov updated IGNITE-20640:
---------------------------------------
Description:
This behavior causes any RAFT operation to get stuck because a leader
cannot be elected. The issue is reproduced in the test
ItDataSchemaSyncTest#checkSchemasCorrectlyRestore. To see it in the log, just add
an assertion:
{code:title=Loza#startRaftGroupNodeInternal}
assert configuration.peers().contains(nodeId.peer()) || configuration.learners().contains(nodeId.peer())
        : "Raft node started on a peer where it should not be";
{code}
{noformat}
[2023-10-10T20:51:51,154][ERROR][%node0%tableManager-io-11][WatchProcessor] Error occurred when processing a watch event
java.lang.AssertionError: Raft node started on a peer where it should not be
    at org.apache.ignite.internal.raft.Loza.startRaftGroupNodeInternal(Loza.java:361) ~[main/:?]
    at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:252) ~[main/:?]
    at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:225) ~[main/:?]
    at org.apache.ignite.internal.table.distributed.TableManager.startPartitionRaftGroupNode(TableManager.java:1986) ~[main/:?]
    at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$90(TableManager.java:1878) ~[main/:?]
    at org.apache.ignite.internal.util.IgniteUtils.inBusyLock(IgniteUtils.java:805) ~[main/:?]
    at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$91(TableManager.java:1848) ~[main/:?]
    at java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783) [?:?]
    at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:834) [?:?]
{noformat}
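For anyone who wants to try the membership check outside of Ignite, here is a minimal, self-contained sketch of the same condition. It is only an illustration: the class RaftMembershipCheck, the methods isMember and requireMember, and the use of plain String IDs for peers are all assumptions made for this example and are not part of Loza or any Ignite API. Since a Java assert is skipped unless the JVM runs with -ea, the sketch also shows an explicit check that fails regardless of that flag.

```java
import java.util.Set;

/** Hypothetical sketch of the membership check; none of these names are Ignite classes. */
final class RaftMembershipCheck {
    /**
     * Returns true if the local peer is listed in the group configuration,
     * either as a voting peer or as a learner. Peers are modeled as plain
     * String IDs here, which is an assumption made for this example.
     */
    static boolean isMember(Set<String> peers, Set<String> learners, String localPeer) {
        return peers.contains(localPeer) || learners.contains(localPeer);
    }

    /**
     * Same condition as the assertion in the description, but thrown explicitly,
     * so it also fires when assertions are disabled.
     */
    static void requireMember(Set<String> peers, Set<String> learners, String localPeer) {
        if (!isMember(peers, learners, localPeer)) {
            throw new IllegalStateException(
                    "Raft node started on a peer where it should not be: " + localPeer);
        }
    }

    public static void main(String[] args) {
        Set<String> peers = Set.of("node0", "node1");
        Set<String> learners = Set.of("node2");

        // node0 is a voting peer and node2 is a learner, so both checks pass.
        requireMember(peers, learners, "node0");
        requireMember(peers, learners, "node2");

        // node3 is in neither set, so starting a Raft node there is rejected.
        try {
            requireMember(peers, learners, "node3");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A check of this kind only reports the problem; it does not explain why the node was asked to start for a group it does not belong to.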
> Raft node started in a node where it should not be
> --------------------------------------------------
>
> Key: IGNITE-20640
> URL: https://issues.apache.org/jira/browse/IGNITE-20640
> Project: Ignite
> Issue Type: Bug
> Reporter: Vladislav Pyatkov
> Priority: Major