[
https://issues.apache.org/jira/browse/IGNITE-26713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexander Lapin updated IGNITE-26713:
-------------------------------------
Priority: Minor (was: Major)
> Failed to change peers
> ----------------------
>
> Key: IGNITE-26713
> URL: https://issues.apache.org/jira/browse/IGNITE-26713
> Project: Ignite
> Issue Type: Improvement
> Components: raft ai3
> Reporter: Iurii Gerzhedovich
> Assignee: Alexander Lapin
> Priority: Minor
> Labels: MakeTeamcityGreenAgain, ignite-3
>
> The cluster goes down with the error `Critical system error detected. Will be handled
> accordingly to configured handler`.
> [TC
> build|https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/9548864?logFilter=debug&logView=flowAware]
>
>
> {code:java}
> 13:37:04 [2025-10-14T05:37:04,055][WARN
> ][%imgdrt_ripwmmio_3344%Raft-Group-Client-7][PartitionReplicaLifecycleManager]
> Failed to change peers [grp=0_part_5].
> 13:37:04
> 13:37:04 java.util.concurrent.CompletionException:
> java.util.concurrent.TimeoutException: Send with retry timed out [retryCount
> = 150, groupId = metastorage_group, traceId = null, request =
> org.apache.ignite.raft.jraft.rpc.ReadActionRequestImpl, originCommand = null,
> retryReasons = [[time=1760438219047, msg=Peer imgdrt_ripwmmio_3345:0 threw
> PeerUnavailableException; attemptWaitDuration=197, attemptDuration=3,
> attemptStartTime=2025-10-14T05:36:59,047], [time=1760438219247, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=197, attemptDuration=3,
> attemptStartTime=2025-10-14T05:36:59,247], [time=1760438219447, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=197, attemptDuration=3,
> attemptStartTime=2025-10-14T05:36:59,447], [time=1760438219648, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=197, attemptDuration=4,
> attemptStartTime=2025-10-14T05:36:59,648], [time=1760438219848, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=196, attemptDuration=4,
> attemptStartTime=2025-10-14T05:36:59,848], [time=1760438220048, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=196, attemptDuration=4,
> attemptStartTime=2025-10-14T05:37:00,048], [time=1760438220248, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=196, attemptDuration=4,
> attemptStartTime=2025-10-14T05:37:00,248], [time=1760438220449, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=196, attemptDuration=5,
> attemptStartTime=2025-10-14T05:37:00,449], [time=1760438220649, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=195, attemptDuration=5,
> attemptStartTime=2025-10-14T05:37:00,649], [time=1760438220849, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=195, attemptDuration=5,
> attemptStartTime=2025-10-14T05:37:00,849], [time=1760438221050, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=195, attemptDuration=6,
> attemptStartTime=2025-10-14T05:37:01,050], [time=1760438221250, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=194, attemptDuration=6,
> attemptStartTime=2025-10-14T05:37:01,250], [time=1760438221450, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=194, attemptDuration=6,
> attemptStartTime=2025-10-14T05:37:01,450], [time=1760438221650, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=194, attemptDuration=6,
> attemptStartTime=2025-10-14T05:37:01,650], [time=1760438221851, msg=Peer
> imgdrt_ripwmmio_3345:0 threw PeerUnavailableException;
> attemptWaitDuration=194, attemptDuration=7,
> attemptStartTime=2025-10-14T05:37:01,851], [time=1760438222051, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=193, attemptDuration=7,
> attemptStartTime=2025-10-14T05:37:02,051], [time=1760438222251, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=193, attemptDuration=7,
> attemptStartTime=2025-10-14T05:37:02,251], [time=1760438222451, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=193, attemptDuration=7,
> attemptStartTime=2025-10-14T05:37:02,451], [time=1760438222652, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=193, attemptDuration=8,
> attemptStartTime=2025-10-14T05:37:02,652], [time=1760438222852, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=192, attemptDuration=8,
> attemptStartTime=2025-10-14T05:37:02,852], [time=1760438223052, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=192, attemptDuration=8,
> attemptStartTime=2025-10-14T05:37:03,052], [time=1760438223253, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=192, attemptDuration=9,
> attemptStartTime=2025-10-14T05:37:03,253], [time=1760438223453, msg=Peer
> imgdrt_ripwmmio_3345:0 threw PeerUnavailableException;
> attemptWaitDuration=191, attemptDuration=9,
> attemptStartTime=2025-10-14T05:37:03,453], [time=1760438223653, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=191, attemptDuration=9,
> attemptStartTime=2025-10-14T05:37:03,653], [time=1760438223853, msg=Peer
> imgdrt_ripwmmio_3345:0 threw PeerUnavailableException;
> attemptWaitDuration=191, attemptDuration=9,
> attemptStartTime=2025-10-14T05:37:03,853]], stopTime = 1760438224017,
> currentTime = 1760438224054, startTime = 1760438194017, duration = 30037].
> 13:37:04 at
> java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332)
> 13:37:04 at
> java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:347)
> 13:37:04 at
> java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:636)
> 13:37:04 at
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
> 13:37:04 at
> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
> 13:37:04 at
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:686)
> 13:37:04 at
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:660)
> 13:37:04 at
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$scheduleRetry$51(RaftGroupServiceImpl.java:910)
> 13:37:04 at
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
> 13:37:04 at
> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> 13:37:04 at
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
> 13:37:04 at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> 13:37:04 at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> 13:37:04 at java.base/java.lang.Thread.run(Thread.java:833)
> 13:37:04 Caused by: java.util.concurrent.TimeoutException: Send with
> retry timed out [retryCount = 150, groupId = metastorage_group, traceId =
> null, request = org.apache.ignite.raft.jraft.rpc.ReadActionRequestImpl,
> originCommand = null, retryReasons = [[time=1760438219047, msg=Peer
> imgdrt_ripwmmio_3345:0 threw PeerUnavailableException;
> attemptWaitDuration=197, attemptDuration=3,
> attemptStartTime=2025-10-14T05:36:59,047], [time=1760438219247, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=197, attemptDuration=3,
> attemptStartTime=2025-10-14T05:36:59,247], [time=1760438219447, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=197, attemptDuration=3,
> attemptStartTime=2025-10-14T05:36:59,447], [time=1760438219648, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=197, attemptDuration=4,
> attemptStartTime=2025-10-14T05:36:59,648], [time=1760438219848, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=196, attemptDuration=4,
> attemptStartTime=2025-10-14T05:36:59,848], [time=1760438220048, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=196, attemptDuration=4,
> attemptStartTime=2025-10-14T05:37:00,048], [time=1760438220248, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=196, attemptDuration=4,
> attemptStartTime=2025-10-14T05:37:00,248], [time=1760438220449, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=196, attemptDuration=5,
> attemptStartTime=2025-10-14T05:37:00,449], [time=1760438220649, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=195, attemptDuration=5,
> attemptStartTime=2025-10-14T05:37:00,649], [time=1760438220849, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=195, attemptDuration=5,
> attemptStartTime=2025-10-14T05:37:00,849], [time=1760438221050, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=195, attemptDuration=6,
> attemptStartTime=2025-10-14T05:37:01,050], [time=1760438221250, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=194, attemptDuration=6,
> attemptStartTime=2025-10-14T05:37:01,250], [time=1760438221450, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=194, attemptDuration=6,
> attemptStartTime=2025-10-14T05:37:01,450], [time=1760438221650, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=194, attemptDuration=6,
> attemptStartTime=2025-10-14T05:37:01,650], [time=1760438221851, msg=Peer
> imgdrt_ripwmmio_3345:0 threw PeerUnavailableException;
> attemptWaitDuration=194, attemptDuration=7,
> attemptStartTime=2025-10-14T05:37:01,851], [time=1760438222051, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=193, attemptDuration=7,
> attemptStartTime=2025-10-14T05:37:02,051], [time=1760438222251, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=193, attemptDuration=7,
> attemptStartTime=2025-10-14T05:37:02,251], [time=1760438222451, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=193, attemptDuration=7,
> attemptStartTime=2025-10-14T05:37:02,451], [time=1760438222652, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=193, attemptDuration=8,
> attemptStartTime=2025-10-14T05:37:02,652], [time=1760438222852, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=192, attemptDuration=8,
> attemptStartTime=2025-10-14T05:37:02,852], [time=1760438223052, msg=Peer
> imgdrt_ripwmmio_3346:0 threw PeerUnavailableException;
> attemptWaitDuration=192, attemptDuration=8,
> attemptStartTime=2025-10-14T05:37:03,052], [time=1760438223253, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=192, attemptDuration=9,
> attemptStartTime=2025-10-14T05:37:03,253], [time=1760438223453, msg=Peer
> imgdrt_ripwmmio_3345:0 threw PeerUnavailableException;
> attemptWaitDuration=191, attemptDuration=9,
> attemptStartTime=2025-10-14T05:37:03,453], [time=1760438223653, msg=Peer
> imgdrt_ripwmmio_3344:0 returned code EPERM: Is not leader.;
> attemptWaitDuration=191, attemptDuration=9,
> attemptStartTime=2025-10-14T05:37:03,653], [time=1760438223853, msg=Peer
> imgdrt_ripwmmio_3345:0 threw PeerUnavailableException;
> attemptWaitDuration=191, attemptDuration=9,
> attemptStartTime=2025-10-14T05:37:03,853]], stopTime = 1760438224017,
> currentTime = 1760438224054, startTime = 1760438194017, duration = 30037].
> 13:37:04 at
> org.apache.ignite.internal.raft.RetryContext.createTimeoutException(RetryContext.java:206)
> 13:37:04 ... 9 more
> 13:37:04 [2025-10-14T05:37:04,055][WARN
> ][%imgdrt_ripwmmio_3344%Raft-Group-Client-4][PartitionReplicaLifecycleManager]
> Failed to change peers [grp=0_part_8].
> 13:37:04 {code}
>
>
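The timing fields in the exception message can be cross-checked with a little arithmetic. The sketch below is not Ignite code: the class and method names are hypothetical, and reading `stopTime` as the retry deadline is an assumption inferred from the field names in the log, not confirmed against the Ignite sources. It computes the elapsed time from the logged `startTime`, `stopTime`, and `currentTime`, and estimates the spacing between attempts from the first five `time=` values in `retryReasons`.

```java
import java.util.List;

// Hypothetical helper for reading the timing fields of the TimeoutException above.
public class RetryTimingCheck {

    /** Elapsed time between two epoch-millisecond timestamps. */
    static long elapsedMillis(long startTime, long endTime) {
        return endTime - startTime;
    }

    /** Mean gap in milliseconds between consecutive, ascending timestamps. */
    static double meanGapMillis(List<Long> times) {
        if (times.size() < 2) {
            throw new IllegalArgumentException("need at least two timestamps");
        }
        long span = times.get(times.size() - 1) - times.get(0);
        return (double) span / (times.size() - 1);
    }

    public static void main(String[] args) {
        // Values copied from the log message.
        long startTime = 1760438194017L;
        long stopTime = 1760438224017L;    // assumed to be the retry deadline
        long currentTime = 1760438224054L;

        // First five "time=" values from retryReasons.
        List<Long> attempts = List.of(
                1760438219047L, 1760438219247L, 1760438219447L,
                1760438219648L, 1760438219848L);

        long budget = elapsedMillis(startTime, stopTime);
        long duration = elapsedMillis(startTime, currentTime);
        double gap = meanGapMillis(attempts);

        System.out.println("budget ms   = " + budget);
        System.out.println("duration ms = " + duration);
        System.out.println("mean gap ms = " + gap);
        // Rough number of attempts that fit into the budget at this spacing.
        System.out.println("attempts    = " + Math.round(budget / gap));
    }
}
```

Under these assumptions the budget comes out at 30000 ms and the attempts are spaced about 200 ms apart, which is consistent with the logged `retryCount = 150` and `duration = 30037`. Note that the message lists only the last 25 retry reasons, covering roughly the final five seconds before the timeout.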