Mirza Aliev created IGNITE-16559:
------------------------------------
Summary: Node's log contains "Failed to refresh a leader" messages.
Key: IGNITE-16559
URL: https://issues.apache.org/jira/browse/IGNITE-16559
Project: Ignite
Issue Type: Bug
Reporter: Mirza Aliev
We noticed that when
{{ItMixedQueriesTest.testIgniteSchemaAwaresAlterTableCommand}} runs on TC, the
log can contain messages like this:
{noformat}
2022-02-15 12:36:43:568 +0300 [ERROR][%ItMixedQueriesTest_null_1%Raft-Group-Client-0][RaftGroupServiceImpl] Failed to refresh a leader [groupId=8e71fc5e-6b24-4b69-ba5a-6eae4c2165cf_part_16]
java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
    at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
    at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
    at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
    at org.apache.ignite.raft.jraft.rpc.impl.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:502)
    at org.apache.ignite.raft.jraft.rpc.impl.RaftGroupServiceImpl$1.lambda$accept$1(RaftGroupServiceImpl.java:544)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.util.concurrent.TimeoutException
{noformat}
Possible root cause:
It seems that we get a TimeoutException when the client tries to get the leader
of a group for which no leader has been elected yet. The logs show that the
timeout exception occurs first and that the leader of the corresponding group
is elected only afterwards.
Note that the test has only one node and 10 partitions for a table, but Raft
leaders are elected sequentially on a node, so electing leaders for 10 Raft
groups on one node can take a little longer.
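The shape of the trace above (a CompletionException caused by a bare
TimeoutException) is what plain java.util.concurrent produces when a future is
completed exceptionally and the failure is observed through a dependent stage.
The sketch below reproduces only that wrapping. The class and variable names
are hypothetical and it does not use any Ignite code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class, not part of Ignite.
final class TimeoutWrappingDemo {
    // Returns the exception seen by a caller that waits on a dependent stage.
    static Throwable observedFailure() {
        // Stands in for the pending "who is the leader?" request.
        CompletableFuture<String> leaderRequest = new CompletableFuture<>();
        // A dependent stage, comparable to the UniApply frame in the trace.
        CompletableFuture<String> dependent = leaderRequest.thenApply(l -> l);

        // The request gives up before any leader has been elected.
        leaderRequest.completeExceptionally(new TimeoutException());

        try {
            dependent.join();
            return null; // not reached: the dependent stage has failed
        } catch (CompletionException e) {
            return e; // the cause is the original TimeoutException
        }
    }
}
```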
Possible solution:
Increase the timeout for the client's first attempt to get a leader.
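A minimal sketch of that idea, assuming the client can tell whether it has
already seen a leader for the group. The class LeaderLookup, both timeout
parameters and the request supplier are hypothetical names used for
illustration. This is not the RaftGroupServiceImpl API, and the real fix may
look different.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, not part of Ignite: remembers whether a leader is
// already known and picks the timeout for the next leader request accordingly.
final class LeaderLookup {
    private final long firstTimeoutMs;   // assumed: generous, covers the initial elections
    private final long regularTimeoutMs; // assumed: used once a leader has been seen
    private volatile String leader;      // null until the first successful lookup

    LeaderLookup(long firstTimeoutMs, long regularTimeoutMs) {
        this.firstTimeoutMs = firstTimeoutMs;
        this.regularTimeoutMs = regularTimeoutMs;
    }

    long currentTimeoutMs() {
        return leader == null ? firstTimeoutMs : regularTimeoutMs;
    }

    // 'request' stands in for whatever call actually asks the group for its leader.
    CompletableFuture<String> refreshLeader(Supplier<CompletableFuture<String>> request) {
        return request.get()
                .orTimeout(currentTimeoutMs(), TimeUnit.MILLISECONDS) // Java 9+
                .thenApply(found -> {
                    leader = found; // later lookups use the regular timeout
                    return found;
                });
    }
}
```

With, for example, new LeaderLookup(5000, 500), the first lookup would wait up
to 5 seconds and later lookups up to 500 ms. Both values are placeholders and
would need to be tuned against the observed election time.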