Mirza Aliev created IGNITE-16559:
------------------------------------

             Summary: Node's log contains "Failed to refresh a leader" messages.
                 Key: IGNITE-16559
                 URL: https://issues.apache.org/jira/browse/IGNITE-16559
             Project: Ignite
          Issue Type: Bug
            Reporter: Mirza Aliev


We noticed that when 
{{ItMixedQueriesTest.testIgniteSchemaAwaresAlterTableCommand}} runs on TC, the 
log can contain messages like this: 

{noformat}
2022-02-15 12:36:43:568 +0300 [ERROR][%ItMixedQueriesTest_null_1%Raft-Group-Client-0][RaftGroupServiceImpl] Failed to refresh a leader [groupId=8e71fc5e-6b24-4b69-ba5a-6eae4c2165cf_part_16]
java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
  at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
  at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
  at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632)
  at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
  at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
  at org.apache.ignite.raft.jraft.rpc.impl.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:502)
  at org.apache.ignite.raft.jraft.rpc.impl.RaftGroupServiceImpl$1.lambda$accept$1(RaftGroupServiceImpl.java:544)
  at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
  at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
  at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
  at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
  at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.util.concurrent.TimeoutException
{noformat}


Possible root cause: 

It seems that we get a TimeoutException when a client tries to get the leader 
of a group for which no leader has been elected yet. The logs show the timeout 
exception first, and the leader of the corresponding group being elected only 
after it. 

Note that the test has only one node and 10 partitions for a table, but raft 
leaders are elected sequentially on a node, so electing leaders for 10 raft 
groups on one node might take a little longer. 
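
To make the suspected effect concrete, here is a rough sketch of the arithmetic. It is not Ignite code: the class, the method, and all durations are made up for illustration, and it assumes every election takes the same time and that elections strictly follow one another.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not part of Ignite: elections on one node run one after another.
final class SequentialElectionSketch {
    /** Returns the partitions whose leader is still not elected when the client timeout expires. */
    static List<Integer> partitionsTimingOut(int partitions, long electionMs, long clientTimeoutMs) {
        List<Integer> late = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            // The leader of partition p is ready only after p earlier elections plus its own.
            long leaderReadyAtMs = (p + 1) * electionMs;
            if (leaderReadyAtMs > clientTimeoutMs) {
                late.add(p);
            }
        }
        return late;
    }

    public static void main(String[] args) {
        // Assumed numbers: 10 partitions, 300 ms per election, 2000 ms client timeout.
        System.out.println(partitionsTimingOut(10, 300, 2000)); // prints [6, 7, 8, 9]
    }
}
```

Under these assumed numbers the first partitions get their leader in time, while the ones elected last hit the client timeout, which would match the pattern in the log.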

Possible solution:

Increase the timeout a client uses when it gets a leader for the first time.
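
A minimal sketch of the idea, using only java.util.concurrent. It does not use the real RaftGroupServiceImpl API: refreshLeader, the "node-0" leader name, and the delays are hypothetical stand-ins, and the election is simulated with a delayed completion.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch, not the Ignite client: shows how the timeout decides the outcome.
final class LeaderRefreshTimeoutSketch {
    /** Simulated leader lookup: the leader becomes known after electionDelayMs. */
    static CompletableFuture<String> refreshLeader(long electionDelayMs, long timeoutMs) {
        CompletableFuture<String> leader = new CompletableFuture<>();
        CompletableFuture.delayedExecutor(electionDelayMs, TimeUnit.MILLISECONDS)
                .execute(() -> leader.complete("node-0"));
        // Fails the future with a TimeoutException if no leader is known in time.
        return leader.orTimeout(timeoutMs, TimeUnit.MILLISECONDS);
    }

    /** Returns true if the leader was obtained, false if the lookup timed out. */
    static boolean succeeds(long electionDelayMs, long timeoutMs) {
        try {
            refreshLeader(electionDelayMs, timeoutMs).join();
            return true;
        } catch (CompletionException e) {
            if (e.getCause() instanceof TimeoutException) {
                return false;
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        System.out.println(succeeds(500, 50));   // short timeout, election still running
        System.out.println(succeeds(500, 5000)); // longer first-time timeout
    }
}
```

With the short timeout the lookup fails in the same way as in the log above (a CompletionException caused by a TimeoutException), while the longer first-time timeout lets the election finish before the client gives up.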



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
