Iurii Gerzhedovich created IGNITE-26713:
-------------------------------------------
Summary: Failed to change peers
Key: IGNITE-26713
URL: https://issues.apache.org/jira/browse/IGNITE-26713
Project: Ignite
Issue Type: Improvement
Components: raft ai3
Reporter: Iurii Gerzhedovich
The cluster goes down with the following error: `Critical system error detected. Will be handled
accordingly to configured handler`.
[TC build|https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/9548864?logFilter=debug&logView=flowAware]
{code:java}
13:37:04 [2025-10-14T05:37:04,055][WARN ][%imgdrt_ripwmmio_3344%Raft-Group-Client-7][PartitionReplicaLifecycleManager] Failed to change peers [grp=0_part_5].
13:37:04
13:37:04 java.util.concurrent.CompletionException:
java.util.concurrent.TimeoutException: Send with retry timed out [retryCount =
150, groupId = metastorage_group, traceId = null, request =
org.apache.ignite.raft.jraft.rpc.ReadActionRequestImpl, originCommand = null,
retryReasons = [[time=1760438219047, msg=Peer imgdrt_ripwmmio_3345:0 threw
PeerUnavailableException; attemptWaitDuration=197, attemptDuration=3,
attemptStartTime=2025-10-14T05:36:59,047], [time=1760438219247, msg=Peer
imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
attemptWaitDuration=197, attemptDuration=3,
attemptStartTime=2025-10-14T05:36:59,247], [time=1760438219447, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=197,
attemptDuration=3, attemptStartTime=2025-10-14T05:36:59,447],
[time=1760438219648, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=197, attemptDuration=4,
attemptStartTime=2025-10-14T05:36:59,648], [time=1760438219848, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=196,
attemptDuration=4, attemptStartTime=2025-10-14T05:36:59,848],
[time=1760438220048, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=196, attemptDuration=4,
attemptStartTime=2025-10-14T05:37:00,048], [time=1760438220248, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=196,
attemptDuration=4, attemptStartTime=2025-10-14T05:37:00,248],
[time=1760438220449, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=196, attemptDuration=5,
attemptStartTime=2025-10-14T05:37:00,449], [time=1760438220649, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=195,
attemptDuration=5, attemptStartTime=2025-10-14T05:37:00,649],
[time=1760438220849, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=195, attemptDuration=5,
attemptStartTime=2025-10-14T05:37:00,849], [time=1760438221050, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=195,
attemptDuration=6, attemptStartTime=2025-10-14T05:37:01,050],
[time=1760438221250, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=194, attemptDuration=6,
attemptStartTime=2025-10-14T05:37:01,250], [time=1760438221450, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=194,
attemptDuration=6, attemptStartTime=2025-10-14T05:37:01,450],
[time=1760438221650, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=194, attemptDuration=6,
attemptStartTime=2025-10-14T05:37:01,650], [time=1760438221851, msg=Peer
imgdrt_ripwmmio_3345:0 threw PeerUnavailableException; attemptWaitDuration=194,
attemptDuration=7, attemptStartTime=2025-10-14T05:37:01,851],
[time=1760438222051, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=193, attemptDuration=7,
attemptStartTime=2025-10-14T05:37:02,051], [time=1760438222251, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=193,
attemptDuration=7, attemptStartTime=2025-10-14T05:37:02,251],
[time=1760438222451, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=193, attemptDuration=7,
attemptStartTime=2025-10-14T05:37:02,451], [time=1760438222652, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=193,
attemptDuration=8, attemptStartTime=2025-10-14T05:37:02,652],
[time=1760438222852, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=192, attemptDuration=8,
attemptStartTime=2025-10-14T05:37:02,852], [time=1760438223052, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=192,
attemptDuration=8, attemptStartTime=2025-10-14T05:37:03,052],
[time=1760438223253, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=192, attemptDuration=9,
attemptStartTime=2025-10-14T05:37:03,253], [time=1760438223453, msg=Peer
imgdrt_ripwmmio_3345:0 threw PeerUnavailableException; attemptWaitDuration=191,
attemptDuration=9, attemptStartTime=2025-10-14T05:37:03,453],
[time=1760438223653, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=191, attemptDuration=9,
attemptStartTime=2025-10-14T05:37:03,653], [time=1760438223853, msg=Peer
imgdrt_ripwmmio_3345:0 threw PeerUnavailableException; attemptWaitDuration=191,
attemptDuration=9, attemptStartTime=2025-10-14T05:37:03,853]], stopTime =
1760438224017, currentTime = 1760438224054, startTime = 1760438194017, duration
= 30037].
13:37:04 at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332)
13:37:04 at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:347)
13:37:04 at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:636)
13:37:04 at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
13:37:04 at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
13:37:04 at org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:686)
13:37:04 at org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:660)
13:37:04 at org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$scheduleRetry$51(RaftGroupServiceImpl.java:910)
13:37:04 at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
13:37:04 at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
13:37:04 at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
13:37:04 at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
13:37:04 at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
13:37:04 at java.base/java.lang.Thread.run(Thread.java:833)
13:37:04 Caused by: java.util.concurrent.TimeoutException: Send with retry
timed out [retryCount = 150, groupId = metastorage_group, traceId = null,
request = org.apache.ignite.raft.jraft.rpc.ReadActionRequestImpl, originCommand
= null, retryReasons = [[time=1760438219047, msg=Peer imgdrt_ripwmmio_3345:0
threw PeerUnavailableException; attemptWaitDuration=197, attemptDuration=3,
attemptStartTime=2025-10-14T05:36:59,047], [time=1760438219247, msg=Peer
imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
attemptWaitDuration=197, attemptDuration=3,
attemptStartTime=2025-10-14T05:36:59,247], [time=1760438219447, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=197,
attemptDuration=3, attemptStartTime=2025-10-14T05:36:59,447],
[time=1760438219648, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=197, attemptDuration=4,
attemptStartTime=2025-10-14T05:36:59,648], [time=1760438219848, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=196,
attemptDuration=4, attemptStartTime=2025-10-14T05:36:59,848],
[time=1760438220048, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=196, attemptDuration=4,
attemptStartTime=2025-10-14T05:37:00,048], [time=1760438220248, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=196,
attemptDuration=4, attemptStartTime=2025-10-14T05:37:00,248],
[time=1760438220449, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=196, attemptDuration=5,
attemptStartTime=2025-10-14T05:37:00,449], [time=1760438220649, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=195,
attemptDuration=5, attemptStartTime=2025-10-14T05:37:00,649],
[time=1760438220849, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=195, attemptDuration=5,
attemptStartTime=2025-10-14T05:37:00,849], [time=1760438221050, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=195,
attemptDuration=6, attemptStartTime=2025-10-14T05:37:01,050],
[time=1760438221250, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=194, attemptDuration=6,
attemptStartTime=2025-10-14T05:37:01,250], [time=1760438221450, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=194,
attemptDuration=6, attemptStartTime=2025-10-14T05:37:01,450],
[time=1760438221650, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=194, attemptDuration=6,
attemptStartTime=2025-10-14T05:37:01,650], [time=1760438221851, msg=Peer
imgdrt_ripwmmio_3345:0 threw PeerUnavailableException; attemptWaitDuration=194,
attemptDuration=7, attemptStartTime=2025-10-14T05:37:01,851],
[time=1760438222051, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=193, attemptDuration=7,
attemptStartTime=2025-10-14T05:37:02,051], [time=1760438222251, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=193,
attemptDuration=7, attemptStartTime=2025-10-14T05:37:02,251],
[time=1760438222451, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=193, attemptDuration=7,
attemptStartTime=2025-10-14T05:37:02,451], [time=1760438222652, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=193,
attemptDuration=8, attemptStartTime=2025-10-14T05:37:02,652],
[time=1760438222852, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=192, attemptDuration=8,
attemptStartTime=2025-10-14T05:37:02,852], [time=1760438223052, msg=Peer
imgdrt_ripwmmio_3346:0 threw PeerUnavailableException; attemptWaitDuration=192,
attemptDuration=8, attemptStartTime=2025-10-14T05:37:03,052],
[time=1760438223253, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=192, attemptDuration=9,
attemptStartTime=2025-10-14T05:37:03,253], [time=1760438223453, msg=Peer
imgdrt_ripwmmio_3345:0 threw PeerUnavailableException; attemptWaitDuration=191,
attemptDuration=9, attemptStartTime=2025-10-14T05:37:03,453],
[time=1760438223653, msg=Peer imgdrt_ripwmmio_3344:0 returned code EPERM: Is
not leader.; attemptWaitDuration=191, attemptDuration=9,
attemptStartTime=2025-10-14T05:37:03,653], [time=1760438223853, msg=Peer
imgdrt_ripwmmio_3345:0 threw PeerUnavailableException; attemptWaitDuration=191,
attemptDuration=9, attemptStartTime=2025-10-14T05:37:03,853]], stopTime =
1760438224017, currentTime = 1760438224054, startTime = 1760438194017, duration
= 30037].
13:37:04 at org.apache.ignite.internal.raft.RetryContext.createTimeoutException(RetryContext.java:206)
13:37:04 ... 9 more
13:37:04 [2025-10-14T05:37:04,055][WARN ][%imgdrt_ripwmmio_3344%Raft-Group-Client-4][PartitionReplicaLifecycleManager] Failed to change peers [grp=0_part_8].
{code}