[jira] [Updated] (IGNITE-27921) ItDisasterRecoveryReconfigurationTest may leave stale nodes

Vyacheslav Koptilin (Jira) Thu, 19 Feb 2026 04:27:01 -0800


     [ https://issues.apache.org/jira/browse/IGNITE-27921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Vyacheslav Koptilin updated IGNITE-27921:
-----------------------------------------
    Description: 
When the test fails to start all required nodes, for example because of a 
timeout, it may leave stale nodes behind:

{noformat}
  java.lang.AssertionError: Race operations took too long.
  java.lang.AssertionError: Race operations took too long.
    at org.apache.ignite.internal.testframework.IgniteTestUtils.createAssertionError(IgniteTestUtils.java:936)
    at org.apache.ignite.internal.testframework.IgniteTestUtils.runRace(IgniteTestUtils.java:927)
    at org.apache.ignite.internal.disaster.ItDisasterRecoveryReconfigurationTest.startNodesInParallel(ItDisasterRecoveryReconfigurationTest.java:2010)
    at org.apache.ignite.internal.disaster.ItDisasterRecoveryReconfigurationTest.setUp(ItDisasterRecoveryReconfigurationTest.java:188)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
  Caused by: java.lang.InterruptedException
    at org.apache.ignite.internal.testframework.IgniteTestUtils.runRace(IgniteTestUtils.java:919)
{noformat}

The root cause is that `runRace`, which is used here to start the nodes in 
parallel:
{noformat}
private void startNodesInParallel(int... nodeIndexes) {
    // One race task per node index; each task starts the corresponding node.
    runRace(
            20_000,
            IntStream.of(nodeIndexes).<RunnableX>mapToObj(i -> () -> cluster.startNode(i)).toArray(RunnableX[]::new)
    );
}
{noformat}
does not wait for all of its internal threads to finish before it fails. A 
thread may therefore still be starting a node when the test's shutdown 
procedure runs, so that procedure might not see all the nodes it has to stop.
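
One possible direction, shown only as a sketch: make the parallel runner join 
every worker thread before it reports a timeout, so that nothing is still 
running when the caller's cleanup starts. The names `JoiningRace`, 
`runAndJoin` and `Task` below are hypothetical stand-ins and are not part of 
`IgniteTestUtils`. The sketch also assumes that each task reacts to an 
interrupt, because the final join is unbounded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not Ignite code): runs tasks in parallel and joins every thread before failing. */
final class JoiningRace {
    /** Hypothetical stand-in for the RunnableX type used by the test. */
    @FunctionalInterface
    interface Task {
        void run() throws Exception;
    }

    static void runAndJoin(long timeoutMs, Task... tasks) {
        List<Throwable> failures = new CopyOnWriteArrayList<>();
        List<Thread> threads = new ArrayList<>();

        for (Task task : tasks) {
            Thread thread = new Thread(() -> {
                try {
                    task.run();
                } catch (Throwable t) {
                    failures.add(t);
                }
            });
            threads.add(thread);
            thread.start();
        }

        long deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        boolean timedOut = false;

        try {
            for (Thread thread : threads) {
                long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadlineNanos - System.nanoTime());
                // join(0) would wait forever, so always pass at least 1 ms.
                thread.join(Math.max(1, remainingMs));
                timedOut |= thread.isAlive();
            }
        } catch (InterruptedException e) {
            // Treat an interrupt of the caller like a timeout, but keep the flag set.
            Thread.currentThread().interrupt();
            timedOut = true;
        }

        if (timedOut) {
            // Ask the workers to stop, then wait until every one of them has exited.
            // Only after that is the failure reported to the caller.
            for (Thread thread : threads) {
                thread.interrupt();
            }
            for (Thread thread : threads) {
                joinUninterruptibly(thread);
            }
            throw new AssertionError("Race operations took too long.");
        }

        if (!failures.isEmpty()) {
            throw new AssertionError("Race operation failed.", failures.get(0));
        }
    }

    /** Waits for the thread to exit even if the caller is interrupted, then restores the interrupt flag. */
    private static void joinUninterruptibly(Thread thread) {
        boolean interrupted = false;
        while (true) {
            try {
                thread.join();
                break;
            } catch (InterruptedException e) {
                interrupted = true;
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With a runner of this shape, a call such as 
`JoiningRace.runAndJoin(20_000, tasks)` would return or throw only after all 
worker threads have exited, so a later shutdown step would not race with a 
node that is still starting. Whether the real fix should change `runRace` 
itself or only the test is left open here.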

> ItDisasterRecoveryReconfigurationTest may leave stale nodes
> -----------------------------------------------------------
>
>                 Key: IGNITE-27921
>                 URL: https://issues.apache.org/jira/browse/IGNITE-27921
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Vyacheslav Koptilin
>            Assignee: Vyacheslav Koptilin
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain, ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
