Alexander Lapin created IGNITE-21441:
----------------------------------------
Summary: ItSchemaChangeTableViewTest#testAddNewColumn is flaky
with Replication is timed out [replicaGrpId=6_part_5]
Key: IGNITE-21441
URL: https://issues.apache.org/jira/browse/IGNITE-21441
Project: Ignite
Issue Type: Bug
Reporter: Alexander Lapin
Similar to IGNITE-21394 but with CancellationException as a root cause instead
of TimeoutException.
{code:java}
Replication is timed out [replicaGrpId=6_part_5]
org.apache.ignite.tx.TransactionException: IGN-REP-3 TraceId:47cb7cb4-3e8d-40ce-8a2f-55d13bb2c798 Replication is timed out [replicaGrpId=6_part_5]
{code}
Possible root cause:
{code:java}
[2024-02-02T09:47:03,851][ERROR][%isctvt_tanc_3346%Raft-Group-Client-11][WatchProcessor] Error occurred when processing a watch event
org.apache.ignite.internal.lang.IgniteInternalException: Failed to get a leader for the RAFT replication group [get=6_part_0].
    at org.apache.ignite.internal.table.distributed.TableManager.lambda$changePeersOnRebalance$96(TableManager.java:1844) ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
    at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986) ~[?:?]
    at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970) ~[?:?]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
    at java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2398) ~[?:?]
    at org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:543) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
    at org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$handleThrowable$41(RaftGroupServiceImpl.java:605) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.util.concurrent.CompletionException: java.util.concurrent.CancellationException
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?]
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346) ~[?:?]
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632) ~[?:?]
    ... 10 more
Caused by: java.util.concurrent.CancellationException
    at java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2396) ~[?:?]
    ... 8 more
[2024-02-02T09:47:03,852][WARN ][%isctvt_tanc_3346%Raft-Group-Client-11][TableManager] Unable to process pending assignments event
org.apache.ignite.internal.lang.IgniteInternalException: Failed to get a leader for the RAFT replication group [get=6_part_0].
{code}
[https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7820437?expandBuildDeploymentsSection=false&hideTestsFromDependencies=false&hideProblemsFromDependencies=false&expandBuildTestsSection=true&showLog=7820408_2572_91.2439.2498&logFilter=debug&expandBuildChangesSection=true&expandBuildProblemsSection=true&expandCode+Inspection=true&logView=flowAware]
Failed locally in 1 out of 100 runs.
h3. Implementation Notes
It seems that we should handle not only TimeoutException but also other
failures, such as CancellationException, when retrieving the leader during
watch event processing.
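A minimal sketch of what such a check could look like, assuming the decision is made on the raw Throwable before it is wrapped into IgniteInternalException. The class LeaderRetrievalErrors and its methods unwrap and isRecoverable are hypothetical names used for illustration only; they are not existing Ignite APIs, and the actual fix may look different. The unwrapping step matters because, as the stack trace above shows, the CancellationException arrives wrapped in a CompletionException.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not part of Ignite): classifies failures of a leader lookup. */
final class LeaderRetrievalErrors {
    private LeaderRetrievalErrors() {
    }

    /** Strips CompletionException/ExecutionException wrappers to reach the underlying cause. */
    static Throwable unwrap(Throwable t) {
        while ((t instanceof CompletionException || t instanceof ExecutionException)
                && t.getCause() != null) {
            t = t.getCause();
        }
        return t;
    }

    /**
     * Returns true for failures that, under this sketch's assumption, should be
     * treated the same way: a timeout or a cancellation of the leader request.
     */
    static boolean isRecoverable(Throwable t) {
        Throwable cause = unwrap(t);
        return cause instanceof TimeoutException || cause instanceof CancellationException;
    }
}
```

A handler passed to CompletableFuture#exceptionally or CompletableFuture#handle could call LeaderRetrievalErrors.isRecoverable(ex) to decide whether to treat the failure like the TimeoutException case or to rethrow it.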