[ https://issues.apache.org/jira/browse/IGNITE-24622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vladislav Pyatkov updated IGNITE-24622:
---------------------------------------
Description:
h3. Motivation
A distributed request to the MS (metastorage) is a long and heavy operation,
and issuing one for each RAFT group is too expensive.
{noformat}
[2025-02-24T17:38:09,004][ERROR][%node_3346%Raft-Group-Client-4][ReplicaManager] Couldn't fetch pending assignments for rebalance failover [groupId=319_part_6, term=3].
java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException: Send with retry timed out [retryCount = 25, groupId = metastorage_group, traceId = 90ec2c7b-9cbb-42d0-bf79-5b374aa28445, request = org.apache.ignite.raft.jraft.rpc.ReadActionRequestImpl, originCommand = null].
    at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332) ~[?:?]
    at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:347) ~[?:?]
    at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:636) ~[?:?]
    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194) ~[?:?]
    at org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:589) ~[ignite-raft-3.1.0-SNAPSHOT.jar:?]
    at org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$scheduleRetry$50(RaftGroupServiceImpl.java:774) ~[ignite-raft-3.1.0-SNAPSHOT.jar:?]
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572) ~[?:?]
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317) ~[?:?]
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
    at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
Caused by: java.util.concurrent.TimeoutException: Send with retry timed out [retryCount = 25, groupId = metastorage_group, traceId = 90ec2c7b-9cbb-42d0-bf79-5b374aa28445, request = org.apache.ignite.raft.jraft.rpc.ReadActionRequestImpl, originCommand = null].
    ... 8 more
{noformat}
h3. Definition of done
Reduce the number of distributed requests, or even query the MS only locally.
> Replica requests distributed assignments after each leader change
> -----------------------------------------------------------------
>
> Key: IGNITE-24622
> URL: https://issues.apache.org/jira/browse/IGNITE-24622
> Project: Ignite
> Issue Type: Improvement
> Reporter: Vladislav Pyatkov
> Priority: Major
> Labels: ignite-3
>
> h3. Motivation
> A distributed request to the MS (metastorage) is a long and heavy operation,
> and issuing one for each RAFT group is too expensive.
> {noformat}
> [2025-02-24T17:38:09,004][ERROR][%node_3346%Raft-Group-Client-4][ReplicaManager] Couldn't fetch pending assignments for rebalance failover [groupId=319_part_6, term=3].
> java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException: Send with retry timed out [retryCount = 25, groupId = metastorage_group, traceId = 90ec2c7b-9cbb-42d0-bf79-5b374aa28445, request = org.apache.ignite.raft.jraft.rpc.ReadActionRequestImpl, originCommand = null].
>     at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332) ~[?:?]
>     at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:347) ~[?:?]
>     at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:636) ~[?:?]
>     at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
>     at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194) ~[?:?]
>     at org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:589) ~[ignite-raft-3.1.0-SNAPSHOT.jar:?]
>     at org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$scheduleRetry$50(RaftGroupServiceImpl.java:774) ~[ignite-raft-3.1.0-SNAPSHOT.jar:?]
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572) ~[?:?]
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317) ~[?:?]
>     at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?]
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
>     at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
> Caused by: java.util.concurrent.TimeoutException: Send with retry timed out [retryCount = 25, groupId = metastorage_group, traceId = 90ec2c7b-9cbb-42d0-bf79-5b374aa28445, request = org.apache.ignite.raft.jraft.rpc.ReadActionRequestImpl, originCommand = null].
>     ... 8 more
> {noformat}
> h3. Definition of done
> Reduce the number of distributed requests, or even query the MS only locally.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)