[ 
https://issues.apache.org/jira/browse/IGNITE-28084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin updated IGNITE-28084:
--------------------------------------
    Description: 
In a very rare case the test {{outdatedNodeLeftEventDoesntCauseQueryToHang}} 
hangs and fails with the following error:
{noformat}
java.util.concurrent.TimeoutException
        at 
java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
        at 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
        at 
org.apache.ignite.internal.testframework.IgniteTestUtils.await(IgniteTestUtils.java:853)
        at 
org.apache.ignite.internal.testframework.IgniteTestUtils.await(IgniteTestUtils.java:882)
        at 
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.outdatedNodeLeftEventDoesntCauseQueryToHang(ExecutionServiceImplTest.java:1112)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
{noformat}

The output also includes diagnostic information:
{noformat}
[2026-03-05T20:54:11,288][WARN ][main][ExecutionServiceImpl] SQL execution 
service could not be stopped within the specified timeout (5000 ms).Debug info 
for query: ExecutionId [queryId=e629446d-3a49-4651-865e-e73668103ba1, 
executionToken=0] (canceled=true, stopped=false)
  Coordinator node: node_1 (current node)
  Root node state: completed exceptionally 
(org.apache.ignite.internal.sql.engine.QueryCancelledException: IGN-SQL-6 The 
query was cancelled while executing. TraceId:ad076434)

  Fragments awaiting init completion:
    id=1, node=node_1
    id=2, node=node_1
    id=2, node=node_2
    id=2, node=node_3



Dumping threads:

"main" prio=5 Id=1 RUNNABLE
        at 
[email protected]/sun.management.ThreadImpl.dumpThreads0(Native Method)
        at 
[email protected]/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:518)
        at 
[email protected]/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:506)
        at 
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.dumpThreads(ExecutionServiceImpl.java:898)
        at 
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.stop(ExecutionServiceImpl.java:712)
        at 
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.lambda$tearDown$5(ExecutionServiceImplTest.java:334)
        at 
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$$Lambda/0x00007fd8f0834488.accept(Unknown
 Source)
        at [email protected]/java.util.ArrayList.forEach(ArrayList.java:1596)
        ...


"Reference Handler" daemon prio=10 Id=9 RUNNABLE
        at 
[email protected]/java.lang.ref.Reference.waitForReferencePendingList(Native 
Method)
        at 
[email protected]/java.lang.ref.Reference.processPendingReferences(Reference.java:246)
        at 
[email protected]/java.lang.ref.Reference$ReferenceHandler.run(Reference.java:208)


"Finalizer" daemon prio=8 Id=10 WAITING on 
java.lang.ref.NativeReferenceQueue$Lock@19c2e5df
        at [email protected]/java.lang.Object.wait0(Native Method)
        -  waiting on java.lang.ref.NativeReferenceQueue$Lock@19c2e5df
        at [email protected]/java.lang.Object.wait(Unknown Source)
        at [email protected]/java.lang.Object.wait(Unknown Source)
        at 
[email protected]/java.lang.ref.NativeReferenceQueue.await(NativeReferenceQueue.java:48)
        at 
[email protected]/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:158)
        at 
[email protected]/java.lang.ref.NativeReferenceQueue.remove(NativeReferenceQueue.java:89)
        at 
[email protected]/java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:173)


"Signal Dispatcher" daemon prio=9 Id=11 RUNNABLE


"Common-Cleaner" daemon prio=8 Id=26 TIMED_WAITING on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6c629587
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6c629587
        at 
[email protected]/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:269)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1886)
        at 
[email protected]/java.lang.ref.ReferenceQueue.await(ReferenceQueue.java:71)
        at 
[email protected]/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:143)
        at 
[email protected]/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:218)
        at 
[email protected]/jdk.internal.ref.CleanerImpl.run(CleanerImpl.java:140)
        at [email protected]/java.lang.Thread.runWith(Thread.java:1596)
        ...


"Monitor Ctrl-Break" daemon prio=5 Id=27 RUNNABLE (in native)
        at [email protected]/sun.nio.ch.SocketDispatcher.read0(Native Method)
        at 
[email protected]/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:47)
        at 
[email protected]/sun.nio.ch.NioSocketImpl.tryRead(NioSocketImpl.java:256)
        at 
[email protected]/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:307)
        at 
[email protected]/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346)
        at 
[email protected]/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796)
        at 
[email protected]/java.net.Socket$SocketInputStream.read(Socket.java:1099)
        at 
[email protected]/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:350)
        ...

        Number of locked synchronizers = 3
        - java.util.concurrent.locks.ReentrantLock$NonfairSync@5f91d91f
        - java.util.concurrent.locks.ReentrantLock$NonfairSync@410ff902
        - java.util.concurrent.locks.ReentrantLock$NonfairSync@4fed00e1


"Notification Thread" daemon prio=9 Id=28 RUNNABLE


"Attach Listener" daemon prio=9 Id=46 RUNNABLE


"FastTimestamps updater" daemon prio=5 Id=49 TIMED_WAITING on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@375b6a36
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@375b6a36
        at 
[email protected]/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:269)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1797)
        at 
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1182)
        at 
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
        at 
[email protected]/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
        at 
[email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at 
[email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        ...


"ForkJoinPool.commonPool-worker-1" daemon prio=5 Id=51 WAITING on 
java.util.concurrent.ForkJoinPool@66d314e3
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on java.util.concurrent.ForkJoinPool@66d314e3
        at 
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
        at 
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)


"ForkJoinPool.commonPool-worker-2" daemon prio=5 Id=65 TIMED_WAITING on 
java.util.concurrent.ForkJoinPool@66d314e3
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on java.util.concurrent.ForkJoinPool@66d314e3
        at 
[email protected]/java.util.concurrent.locks.LockSupport.parkUntil(LockSupport.java:449)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1891)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
        at 
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)


"ForkJoinPool.commonPool-worker-3" daemon prio=5 Id=66 WAITING on 
java.util.concurrent.ForkJoinPool@66d314e3
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on java.util.concurrent.ForkJoinPool@66d314e3
        at 
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
        at 
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)


"ForkJoinPool.commonPool-worker-4" daemon prio=5 Id=1027 WAITING on 
java.util.concurrent.ForkJoinPool@66d314e3
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on java.util.concurrent.ForkJoinPool@66d314e3
        at 
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
        at 
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)


"test-ExecutionServiceImplTest-outdatedNodeLeftEventDoesntCauseQueryToHang-commonExecutor0"
 prio=5 Id=289439 WAITING on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@582a26d2
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@582a26d2
        at 
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1746)
        at 
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1170)
        at 
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
        ...


"%node_1%sql-execution-pool-0" prio=5 Id=289442 WAITING on 
java.util.concurrent.CountDownLatch$Sync@7fe7df3e
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on java.util.concurrent.CountDownLatch$Sync@7fe7df3e
        at 
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:788)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1133)
        at 
[email protected]/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:230)
        at 
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.lambda$outdatedNodeLeftEventDoesntCauseQueryToHang$69(ExecutionServiceImplTest.java:1089)
        at 
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$$Lambda/0x00007fd8f079ca28.intercept(Unknown
 Source)
        at 
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.onReceive(ExecutionServiceImplTest.java:1626)
        ...

        Number of locked synchronizers = 1
        - java.util.concurrent.ThreadPoolExecutor$Worker@3e398f9c


"%node_1%sql-execution-pool-1" prio=5 Id=289444 WAITING on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1301102e
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1301102e
        at 
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1746)
        at 
[email protected]/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435)
        at 
[email protected]/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
        ...
{noformat}

Need to investigate and fix the issue.
The issue can be easily reproduced with adding a little delay to test method 
IginteTestUtils.runAsync(..) before returning future
{code:Java}
        }).start();
        // add sleep(100) here
        return fut;
{code}

*Note:* there is also a thread leak present, because after each test it is 
necessary to stop the running QueryTaskExecutor-s

  was:
In a very rare case the test hangs and fails with the following error:
{noformat}
java.util.concurrent.TimeoutException
        at 
java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
        at 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
        at 
org.apache.ignite.internal.testframework.IgniteTestUtils.await(IgniteTestUtils.java:853)
        at 
org.apache.ignite.internal.testframework.IgniteTestUtils.await(IgniteTestUtils.java:882)
        at 
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.outdatedNodeLeftEventDoesntCauseQueryToHang(ExecutionServiceImplTest.java:1112)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
{noformat}

The output also includes diagnostic information:
{noformat}
[2026-03-05T20:54:11,288][WARN ][main][ExecutionServiceImpl] SQL execution 
service could not be stopped within the specified timeout (5000 ms).Debug info 
for query: ExecutionId [queryId=e629446d-3a49-4651-865e-e73668103ba1, 
executionToken=0] (canceled=true, stopped=false)
  Coordinator node: node_1 (current node)
  Root node state: completed exceptionally 
(org.apache.ignite.internal.sql.engine.QueryCancelledException: IGN-SQL-6 The 
query was cancelled while executing. TraceId:ad076434)

  Fragments awaiting init completion:
    id=1, node=node_1
    id=2, node=node_1
    id=2, node=node_2
    id=2, node=node_3



Dumping threads:

"main" prio=5 Id=1 RUNNABLE
        at 
[email protected]/sun.management.ThreadImpl.dumpThreads0(Native Method)
        at 
[email protected]/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:518)
        at 
[email protected]/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:506)
        at 
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.dumpThreads(ExecutionServiceImpl.java:898)
        at 
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.stop(ExecutionServiceImpl.java:712)
        at 
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.lambda$tearDown$5(ExecutionServiceImplTest.java:334)
        at 
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$$Lambda/0x00007fd8f0834488.accept(Unknown
 Source)
        at [email protected]/java.util.ArrayList.forEach(ArrayList.java:1596)
        ...


"Reference Handler" daemon prio=10 Id=9 RUNNABLE
        at 
[email protected]/java.lang.ref.Reference.waitForReferencePendingList(Native 
Method)
        at 
[email protected]/java.lang.ref.Reference.processPendingReferences(Reference.java:246)
        at 
[email protected]/java.lang.ref.Reference$ReferenceHandler.run(Reference.java:208)


"Finalizer" daemon prio=8 Id=10 WAITING on 
java.lang.ref.NativeReferenceQueue$Lock@19c2e5df
        at [email protected]/java.lang.Object.wait0(Native Method)
        -  waiting on java.lang.ref.NativeReferenceQueue$Lock@19c2e5df
        at [email protected]/java.lang.Object.wait(Unknown Source)
        at [email protected]/java.lang.Object.wait(Unknown Source)
        at 
[email protected]/java.lang.ref.NativeReferenceQueue.await(NativeReferenceQueue.java:48)
        at 
[email protected]/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:158)
        at 
[email protected]/java.lang.ref.NativeReferenceQueue.remove(NativeReferenceQueue.java:89)
        at 
[email protected]/java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:173)


"Signal Dispatcher" daemon prio=9 Id=11 RUNNABLE


"Common-Cleaner" daemon prio=8 Id=26 TIMED_WAITING on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6c629587
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6c629587
        at 
[email protected]/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:269)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1886)
        at 
[email protected]/java.lang.ref.ReferenceQueue.await(ReferenceQueue.java:71)
        at 
[email protected]/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:143)
        at 
[email protected]/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:218)
        at 
[email protected]/jdk.internal.ref.CleanerImpl.run(CleanerImpl.java:140)
        at [email protected]/java.lang.Thread.runWith(Thread.java:1596)
        ...


"Monitor Ctrl-Break" daemon prio=5 Id=27 RUNNABLE (in native)
        at [email protected]/sun.nio.ch.SocketDispatcher.read0(Native Method)
        at 
[email protected]/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:47)
        at 
[email protected]/sun.nio.ch.NioSocketImpl.tryRead(NioSocketImpl.java:256)
        at 
[email protected]/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:307)
        at 
[email protected]/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346)
        at 
[email protected]/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796)
        at 
[email protected]/java.net.Socket$SocketInputStream.read(Socket.java:1099)
        at 
[email protected]/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:350)
        ...

        Number of locked synchronizers = 3
        - java.util.concurrent.locks.ReentrantLock$NonfairSync@5f91d91f
        - java.util.concurrent.locks.ReentrantLock$NonfairSync@410ff902
        - java.util.concurrent.locks.ReentrantLock$NonfairSync@4fed00e1


"Notification Thread" daemon prio=9 Id=28 RUNNABLE


"Attach Listener" daemon prio=9 Id=46 RUNNABLE


"FastTimestamps updater" daemon prio=5 Id=49 TIMED_WAITING on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@375b6a36
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@375b6a36
        at 
[email protected]/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:269)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1797)
        at 
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1182)
        at 
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
        at 
[email protected]/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
        at 
[email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at 
[email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        ...


"ForkJoinPool.commonPool-worker-1" daemon prio=5 Id=51 WAITING on 
java.util.concurrent.ForkJoinPool@66d314e3
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on java.util.concurrent.ForkJoinPool@66d314e3
        at 
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
        at 
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)


"ForkJoinPool.commonPool-worker-2" daemon prio=5 Id=65 TIMED_WAITING on 
java.util.concurrent.ForkJoinPool@66d314e3
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on java.util.concurrent.ForkJoinPool@66d314e3
        at 
[email protected]/java.util.concurrent.locks.LockSupport.parkUntil(LockSupport.java:449)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1891)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
        at 
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)


"ForkJoinPool.commonPool-worker-3" daemon prio=5 Id=66 WAITING on 
java.util.concurrent.ForkJoinPool@66d314e3
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on java.util.concurrent.ForkJoinPool@66d314e3
        at 
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
        at 
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)


"ForkJoinPool.commonPool-worker-4" daemon prio=5 Id=1027 WAITING on 
java.util.concurrent.ForkJoinPool@66d314e3
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on java.util.concurrent.ForkJoinPool@66d314e3
        at 
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
        at 
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)


"test-ExecutionServiceImplTest-outdatedNodeLeftEventDoesntCauseQueryToHang-commonExecutor0"
 prio=5 Id=289439 WAITING on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@582a26d2
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@582a26d2
        at 
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1746)
        at 
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1170)
        at 
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
        ...


"%node_1%sql-execution-pool-0" prio=5 Id=289442 WAITING on 
java.util.concurrent.CountDownLatch$Sync@7fe7df3e
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on java.util.concurrent.CountDownLatch$Sync@7fe7df3e
        at 
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:788)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1133)
        at 
[email protected]/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:230)
        at 
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.lambda$outdatedNodeLeftEventDoesntCauseQueryToHang$69(ExecutionServiceImplTest.java:1089)
        at 
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$$Lambda/0x00007fd8f079ca28.intercept(Unknown
 Source)
        at 
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.onReceive(ExecutionServiceImplTest.java:1626)
        ...

        Number of locked synchronizers = 1
        - java.util.concurrent.ThreadPoolExecutor$Worker@3e398f9c


"%node_1%sql-execution-pool-1" prio=5 Id=289444 WAITING on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1301102e
        at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
        -  waiting on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1301102e
        at 
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
        at 
[email protected]/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
        at 
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1746)
        at 
[email protected]/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435)
        at 
[email protected]/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
        ...
{noformat}

Need to investigate and fix the issue.
The issue can be easily reproduced with adding a little delay to test method 
IginteTestUtils.runAsync(..) before returning future
{code:Java}
        }).start();
        // add sleep(100) here
        return fut;
{code}

*Note:* there is also a thread leak present, because after each test it is 
necessary to stop the running QueryTaskExecutor-s


> Sql. ExecutionServiceImplTest.outdatedNodeLeftEventDoesntCauseQueryToHang may 
> hang
> ----------------------------------------------------------------------------------
>
>                 Key: IGNITE-28084
>                 URL: https://issues.apache.org/jira/browse/IGNITE-28084
>             Project: Ignite
>          Issue Type: Bug
>          Components: sql ai3
>            Reporter: Pavel Pereslegin
>            Assignee: Pavel Pereslegin
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain, ignite-3
>
> In a very rare case the test {{outdatedNodeLeftEventDoesntCauseQueryToHang}} 
> hangs and fails with the following error:
> {noformat}
> java.util.concurrent.TimeoutException
>       at 
> java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
>       at 
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
>       at 
> org.apache.ignite.internal.testframework.IgniteTestUtils.await(IgniteTestUtils.java:853)
>       at 
> org.apache.ignite.internal.testframework.IgniteTestUtils.await(IgniteTestUtils.java:882)
>       at 
> org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.outdatedNodeLeftEventDoesntCauseQueryToHang(ExecutionServiceImplTest.java:1112)
>       at java.base/java.lang.reflect.Method.invoke(Method.java:580)
>       at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
>       at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
> {noformat}
> The output also includes diagnostic information:
> {noformat}
> [2026-03-05T20:54:11,288][WARN ][main][ExecutionServiceImpl] SQL execution 
> service could not be stopped within the specified timeout (5000 ms).Debug 
> info for query: ExecutionId [queryId=e629446d-3a49-4651-865e-e73668103ba1, 
> executionToken=0] (canceled=true, stopped=false)
>   Coordinator node: node_1 (current node)
>   Root node state: completed exceptionally 
> (org.apache.ignite.internal.sql.engine.QueryCancelledException: IGN-SQL-6 The 
> query was cancelled while executing. TraceId:ad076434)
>   Fragments awaiting init completion:
>     id=1, node=node_1
>     id=2, node=node_1
>     id=2, node=node_2
>     id=2, node=node_3
> Dumping threads:
> "main" prio=5 Id=1 RUNNABLE
>       at 
> [email protected]/sun.management.ThreadImpl.dumpThreads0(Native Method)
>       at 
> [email protected]/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:518)
>       at 
> [email protected]/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:506)
>       at 
> app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.dumpThreads(ExecutionServiceImpl.java:898)
>       at 
> app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.stop(ExecutionServiceImpl.java:712)
>       at 
> app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.lambda$tearDown$5(ExecutionServiceImplTest.java:334)
>       at 
> app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$$Lambda/0x00007fd8f0834488.accept(Unknown
>  Source)
>       at [email protected]/java.util.ArrayList.forEach(ArrayList.java:1596)
>       ...
> "Reference Handler" daemon prio=10 Id=9 RUNNABLE
>       at 
> [email protected]/java.lang.ref.Reference.waitForReferencePendingList(Native 
> Method)
>       at 
> [email protected]/java.lang.ref.Reference.processPendingReferences(Reference.java:246)
>       at 
> [email protected]/java.lang.ref.Reference$ReferenceHandler.run(Reference.java:208)
> "Finalizer" daemon prio=8 Id=10 WAITING on 
> java.lang.ref.NativeReferenceQueue$Lock@19c2e5df
>       at [email protected]/java.lang.Object.wait0(Native Method)
>       -  waiting on java.lang.ref.NativeReferenceQueue$Lock@19c2e5df
>       at [email protected]/java.lang.Object.wait(Unknown Source)
>       at [email protected]/java.lang.Object.wait(Unknown Source)
>       at 
> [email protected]/java.lang.ref.NativeReferenceQueue.await(NativeReferenceQueue.java:48)
>       at 
> [email protected]/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:158)
>       at 
> [email protected]/java.lang.ref.NativeReferenceQueue.remove(NativeReferenceQueue.java:89)
>       at 
> [email protected]/java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:173)
> "Signal Dispatcher" daemon prio=9 Id=11 RUNNABLE
> "Common-Cleaner" daemon prio=8 Id=26 TIMED_WAITING on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6c629587
>       at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
>       -  waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6c629587
>       at 
> [email protected]/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:269)
>       at 
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1886)
>       at 
> [email protected]/java.lang.ref.ReferenceQueue.await(ReferenceQueue.java:71)
>       at 
> [email protected]/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:143)
>       at 
> [email protected]/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:218)
>       at 
> [email protected]/jdk.internal.ref.CleanerImpl.run(CleanerImpl.java:140)
>       at [email protected]/java.lang.Thread.runWith(Thread.java:1596)
>       ...
> "Monitor Ctrl-Break" daemon prio=5 Id=27 RUNNABLE (in native)
>       at [email protected]/sun.nio.ch.SocketDispatcher.read0(Native Method)
>       at 
> [email protected]/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:47)
>       at 
> [email protected]/sun.nio.ch.NioSocketImpl.tryRead(NioSocketImpl.java:256)
>       at 
> [email protected]/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:307)
>       at 
> [email protected]/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346)
>       at 
> [email protected]/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796)
>       at 
> [email protected]/java.net.Socket$SocketInputStream.read(Socket.java:1099)
>       at 
> [email protected]/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:350)
>       ...
>       Number of locked synchronizers = 3
>       - java.util.concurrent.locks.ReentrantLock$NonfairSync@5f91d91f
>       - java.util.concurrent.locks.ReentrantLock$NonfairSync@410ff902
>       - java.util.concurrent.locks.ReentrantLock$NonfairSync@4fed00e1
> "Notification Thread" daemon prio=9 Id=28 RUNNABLE
> "Attach Listener" daemon prio=9 Id=46 RUNNABLE
> "FastTimestamps updater" daemon prio=5 Id=49 TIMED_WAITING on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@375b6a36
>       at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
>       -  waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@375b6a36
>       at 
> [email protected]/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:269)
>       at 
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1797)
>       at 
> [email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1182)
>       at 
> [email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
>       at 
> [email protected]/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
>       at 
> [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>       at 
> [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
>       ...
> "ForkJoinPool.commonPool-worker-1" daemon prio=5 Id=51 WAITING on 
> java.util.concurrent.ForkJoinPool@66d314e3
>       at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
>       -  waiting on java.util.concurrent.ForkJoinPool@66d314e3
>       at 
> [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
>       at 
> [email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
>       at 
> [email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
>       at 
> [email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
> "ForkJoinPool.commonPool-worker-2" daemon prio=5 Id=65 TIMED_WAITING on 
> java.util.concurrent.ForkJoinPool@66d314e3
>       at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
>       -  waiting on java.util.concurrent.ForkJoinPool@66d314e3
>       at 
> [email protected]/java.util.concurrent.locks.LockSupport.parkUntil(LockSupport.java:449)
>       at 
> [email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1891)
>       at 
> [email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
>       at 
> [email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
> "ForkJoinPool.commonPool-worker-3" daemon prio=5 Id=66 WAITING on 
> java.util.concurrent.ForkJoinPool@66d314e3
>       at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
>       -  waiting on java.util.concurrent.ForkJoinPool@66d314e3
>       at 
> [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
>       at 
> [email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
>       at 
> [email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
>       at 
> [email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
> "ForkJoinPool.commonPool-worker-4" daemon prio=5 Id=1027 WAITING on 
> java.util.concurrent.ForkJoinPool@66d314e3
>       at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
>       -  waiting on java.util.concurrent.ForkJoinPool@66d314e3
>       at 
> [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
>       at 
> [email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
>       at 
> [email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
>       at 
> [email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
> "test-ExecutionServiceImplTest-outdatedNodeLeftEventDoesntCauseQueryToHang-commonExecutor0"
>  prio=5 Id=289439 WAITING on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@582a26d2
>       at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
>       -  waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@582a26d2
>       at 
> [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
>       at 
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
>       at 
> [email protected]/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
>       at 
> [email protected]/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
>       at 
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1746)
>       at 
> [email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1170)
>       at 
> [email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
>       ...
> "%node_1%sql-execution-pool-0" prio=5 Id=289442 WAITING on 
> java.util.concurrent.CountDownLatch$Sync@7fe7df3e
>       at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
>       -  waiting on java.util.concurrent.CountDownLatch$Sync@7fe7df3e
>       at 
> [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
>       at 
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:788)
>       at 
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1133)
>       at 
> [email protected]/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:230)
>       at 
> app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.lambda$outdatedNodeLeftEventDoesntCauseQueryToHang$69(ExecutionServiceImplTest.java:1089)
>       at 
> app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$$Lambda/0x00007fd8f079ca28.intercept(Unknown
>  Source)
>       at 
> app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.onReceive(ExecutionServiceImplTest.java:1626)
>       ...
>       Number of locked synchronizers = 1
>       - java.util.concurrent.ThreadPoolExecutor$Worker@3e398f9c
> "%node_1%sql-execution-pool-1" prio=5 Id=289444 WAITING on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1301102e
>       at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
>       -  waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1301102e
>       at 
> [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
>       at 
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
>       at 
> [email protected]/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
>       at 
> [email protected]/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
>       at 
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1746)
>       at 
> [email protected]/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435)
>       at 
> [email protected]/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
>       ...
> {noformat}
> Need to investigate and fix the issue.
> The issue can be easily reproduced with adding a little delay to test method 
> IginteTestUtils.runAsync(..) before returning future
> {code:Java}
>         }).start();
>         // add sleep(100) here
>         return fut;
> {code}
> *Note:* there is also a thread leak present, because after each test it is 
> necessary to stop the running QueryTaskExecutor-s



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to