[ 
https://issues.apache.org/jira/browse/IGNITE-19095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17748619#comment-17748619
 ] 

Alexander Lapin commented on IGNITE-19095:
------------------------------------------

There's a good chance that this issue will be fixed automatically when we 
switch from the current Raft client to a topology-aware one.

> Cyclic retry of ActionRequest in RaftGroupServiceImpl
> -----------------------------------------------------
>
>                 Key: IGNITE-19095
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19095
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Konstantin Orlov
>            Assignee: Alexander Lapin
>            Priority: Critical
>              Labels: ignite-3
>         Attachments: log_pollution.txt
>
>
> Please take a look at the following snippet:
> {code:java}
> private void handleThrowable(
>            ...
>     ) {
>         if (recoverable(err)) {
>             ...
>             scheduleRetry(() -> sendWithRetry(randomNode(peer), requestFactory, stopTime, fut));
>         } else {
>             fut.completeExceptionally(err);
>         }
>     }
> {code}
> In case of a recoverable error, the request is sent again. But if 2 
> out of 3 nodes have already been stopped, this retry logic gets stuck in an 
> infinite loop. The reason is that ConnectException is considered recoverable, 
> and when choosing another node we take into account only the node that 
> failed during the current iteration, so nodes that failed in earlier 
> iterations can be picked again.
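
One possible way to break the loop described above is to remember every peer that has failed for a given request, not only the most recent one, and to give up once no untried peer is left. The sketch below is only an illustration under that assumption: RetryPeerSelector, markFailed and nextPeer are hypothetical names, peers are modelled as plain strings, and none of this is taken from RaftGroupServiceImpl.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Random;
import java.util.Set;

/** Hypothetical helper (not Ignite code): tracks all peers that failed while serving one request. */
final class RetryPeerSelector {
    private final List<String> peers;
    private final Set<String> failed = new HashSet<>();
    private final Random random;

    RetryPeerSelector(List<String> peers, Random random) {
        this.peers = List.copyOf(peers);
        this.random = random;
    }

    /** Records a peer whose attempt ended with a recoverable error. */
    void markFailed(String peer) {
        failed.add(peer);
    }

    /** Returns a random peer that has not failed yet, or empty if every peer has failed. */
    Optional<String> nextPeer() {
        List<String> candidates = new ArrayList<>();
        for (String p : peers) {
            if (!failed.contains(p)) {
                candidates.add(p);
            }
        }
        if (candidates.isEmpty()) {
            // Nothing left to try: the caller can complete the future exceptionally.
            return Optional.empty();
        }
        return Optional.of(candidates.get(random.nextInt(candidates.size())));
    }

    /** Demo: two of three nodes are stopped, so the retries must end on the remaining one. */
    public static void main(String[] args) {
        Set<String> stopped = Set.of("node-1", "node-2");
        RetryPeerSelector selector =
                new RetryPeerSelector(List.of("node-1", "node-2", "node-3"), new Random());

        Optional<String> peer = selector.nextPeer();
        while (peer.isPresent() && stopped.contains(peer.get())) {
            selector.markFailed(peer.get()); // simulated ConnectException
            peer = selector.nextPeer();
        }
        System.out.println(peer.map(p -> "served by " + p).orElse("all peers failed"));
    }
}
```

With this bookkeeping the number of retries is bounded by the number of peers. A real client would probably also need to forget failures when a node comes back, which is where a topology-aware client could help.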



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
