[
https://issues.apache.org/jira/browse/IGNITE-28084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pavel Pereslegin updated IGNITE-28084:
--------------------------------------
Description:
In a very rare case the test {{outdatedNodeLeftEventDoesntCauseQueryToHang}}
hangs and fails with the following error:
{noformat}
java.util.concurrent.TimeoutException
at
java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
at
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
at
org.apache.ignite.internal.testframework.IgniteTestUtils.await(IgniteTestUtils.java:853)
at
org.apache.ignite.internal.testframework.IgniteTestUtils.await(IgniteTestUtils.java:882)
at
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.outdatedNodeLeftEventDoesntCauseQueryToHang(ExecutionServiceImplTest.java:1112)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
{noformat}
The output also includes diagnostic information:
{noformat}
[2026-03-05T20:54:11,288][WARN ][main][ExecutionServiceImpl] SQL execution
service could not be stopped within the specified timeout (5000 ms).Debug info
for query: ExecutionId [queryId=e629446d-3a49-4651-865e-e73668103ba1,
executionToken=0] (canceled=true, stopped=false)
Coordinator node: node_1 (current node)
Root node state: completed exceptionally
(org.apache.ignite.internal.sql.engine.QueryCancelledException: IGN-SQL-6 The
query was cancelled while executing. TraceId:ad076434)
Fragments awaiting init completion:
id=1, node=node_1
id=2, node=node_1
id=2, node=node_2
id=2, node=node_3
Dumping threads:
"main" prio=5 Id=1 RUNNABLE
at
[email protected]/sun.management.ThreadImpl.dumpThreads0(Native Method)
at
[email protected]/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:518)
at
[email protected]/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:506)
at
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.dumpThreads(ExecutionServiceImpl.java:898)
at
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.stop(ExecutionServiceImpl.java:712)
at
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.lambda$tearDown$5(ExecutionServiceImplTest.java:334)
at
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$$Lambda/0x00007fd8f0834488.accept(Unknown
Source)
at [email protected]/java.util.ArrayList.forEach(ArrayList.java:1596)
...
"Reference Handler" daemon prio=10 Id=9 RUNNABLE
at
[email protected]/java.lang.ref.Reference.waitForReferencePendingList(Native
Method)
at
[email protected]/java.lang.ref.Reference.processPendingReferences(Reference.java:246)
at
[email protected]/java.lang.ref.Reference$ReferenceHandler.run(Reference.java:208)
"Finalizer" daemon prio=8 Id=10 WAITING on
java.lang.ref.NativeReferenceQueue$Lock@19c2e5df
at [email protected]/java.lang.Object.wait0(Native Method)
- waiting on java.lang.ref.NativeReferenceQueue$Lock@19c2e5df
at [email protected]/java.lang.Object.wait(Unknown Source)
at [email protected]/java.lang.Object.wait(Unknown Source)
at
[email protected]/java.lang.ref.NativeReferenceQueue.await(NativeReferenceQueue.java:48)
at
[email protected]/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:158)
at
[email protected]/java.lang.ref.NativeReferenceQueue.remove(NativeReferenceQueue.java:89)
at
[email protected]/java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:173)
"Signal Dispatcher" daemon prio=9 Id=11 RUNNABLE
"Common-Cleaner" daemon prio=8 Id=26 TIMED_WAITING on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6c629587
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6c629587
at
[email protected]/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:269)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1886)
at
[email protected]/java.lang.ref.ReferenceQueue.await(ReferenceQueue.java:71)
at
[email protected]/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:143)
at
[email protected]/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:218)
at
[email protected]/jdk.internal.ref.CleanerImpl.run(CleanerImpl.java:140)
at [email protected]/java.lang.Thread.runWith(Thread.java:1596)
...
"Monitor Ctrl-Break" daemon prio=5 Id=27 RUNNABLE (in native)
at [email protected]/sun.nio.ch.SocketDispatcher.read0(Native Method)
at
[email protected]/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:47)
at
[email protected]/sun.nio.ch.NioSocketImpl.tryRead(NioSocketImpl.java:256)
at
[email protected]/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:307)
at
[email protected]/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346)
at
[email protected]/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796)
at
[email protected]/java.net.Socket$SocketInputStream.read(Socket.java:1099)
at
[email protected]/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:350)
...
Number of locked synchronizers = 3
- java.util.concurrent.locks.ReentrantLock$NonfairSync@5f91d91f
- java.util.concurrent.locks.ReentrantLock$NonfairSync@410ff902
- java.util.concurrent.locks.ReentrantLock$NonfairSync@4fed00e1
"Notification Thread" daemon prio=9 Id=28 RUNNABLE
"Attach Listener" daemon prio=9 Id=46 RUNNABLE
"FastTimestamps updater" daemon prio=5 Id=49 TIMED_WAITING on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@375b6a36
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@375b6a36
at
[email protected]/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:269)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1797)
at
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1182)
at
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
at
[email protected]/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
at
[email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at
[email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
...
"ForkJoinPool.commonPool-worker-1" daemon prio=5 Id=51 WAITING on
java.util.concurrent.ForkJoinPool@66d314e3
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.ForkJoinPool@66d314e3
at
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
at
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
at
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
at
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
"ForkJoinPool.commonPool-worker-2" daemon prio=5 Id=65 TIMED_WAITING on
java.util.concurrent.ForkJoinPool@66d314e3
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.ForkJoinPool@66d314e3
at
[email protected]/java.util.concurrent.locks.LockSupport.parkUntil(LockSupport.java:449)
at
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1891)
at
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
at
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
"ForkJoinPool.commonPool-worker-3" daemon prio=5 Id=66 WAITING on
java.util.concurrent.ForkJoinPool@66d314e3
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.ForkJoinPool@66d314e3
at
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
at
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
at
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
at
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
"ForkJoinPool.commonPool-worker-4" daemon prio=5 Id=1027 WAITING on
java.util.concurrent.ForkJoinPool@66d314e3
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.ForkJoinPool@66d314e3
at
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
at
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
at
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
at
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
"test-ExecutionServiceImplTest-outdatedNodeLeftEventDoesntCauseQueryToHang-commonExecutor0"
prio=5 Id=289439 WAITING on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@582a26d2
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@582a26d2
at
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
at
[email protected]/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
at
[email protected]/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1746)
at
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1170)
at
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
...
"%node_1%sql-execution-pool-0" prio=5 Id=289442 WAITING on
java.util.concurrent.CountDownLatch$Sync@7fe7df3e
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.CountDownLatch$Sync@7fe7df3e
at
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:788)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1133)
at
[email protected]/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:230)
at
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.lambda$outdatedNodeLeftEventDoesntCauseQueryToHang$69(ExecutionServiceImplTest.java:1089)
at
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$$Lambda/0x00007fd8f079ca28.intercept(Unknown
Source)
at
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.onReceive(ExecutionServiceImplTest.java:1626)
...
Number of locked synchronizers = 1
- java.util.concurrent.ThreadPoolExecutor$Worker@3e398f9c
"%node_1%sql-execution-pool-1" prio=5 Id=289444 WAITING on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1301102e
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1301102e
at
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
at
[email protected]/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
at
[email protected]/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1746)
at
[email protected]/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435)
at
[email protected]/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
...
{noformat}
Need to investigate and fix the issue.
The issue can be easily reproduced with adding a little delay to test method
IginteTestUtils.runAsync(..) before returning future
{code:Java}
}).start();
// add sleep(100) here
return fut;
{code}
*Note:* there is also a thread leak present, because after each test it is
necessary to stop the running QueryTaskExecutor-s
was:
In a very rare case the test hangs and fails with the following error:
{noformat}
java.util.concurrent.TimeoutException
at
java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
at
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
at
org.apache.ignite.internal.testframework.IgniteTestUtils.await(IgniteTestUtils.java:853)
at
org.apache.ignite.internal.testframework.IgniteTestUtils.await(IgniteTestUtils.java:882)
at
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.outdatedNodeLeftEventDoesntCauseQueryToHang(ExecutionServiceImplTest.java:1112)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
{noformat}
The output also includes diagnostic information:
{noformat}
[2026-03-05T20:54:11,288][WARN ][main][ExecutionServiceImpl] SQL execution
service could not be stopped within the specified timeout (5000 ms).Debug info
for query: ExecutionId [queryId=e629446d-3a49-4651-865e-e73668103ba1,
executionToken=0] (canceled=true, stopped=false)
Coordinator node: node_1 (current node)
Root node state: completed exceptionally
(org.apache.ignite.internal.sql.engine.QueryCancelledException: IGN-SQL-6 The
query was cancelled while executing. TraceId:ad076434)
Fragments awaiting init completion:
id=1, node=node_1
id=2, node=node_1
id=2, node=node_2
id=2, node=node_3
Dumping threads:
"main" prio=5 Id=1 RUNNABLE
at
[email protected]/sun.management.ThreadImpl.dumpThreads0(Native Method)
at
[email protected]/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:518)
at
[email protected]/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:506)
at
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.dumpThreads(ExecutionServiceImpl.java:898)
at
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.stop(ExecutionServiceImpl.java:712)
at
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.lambda$tearDown$5(ExecutionServiceImplTest.java:334)
at
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$$Lambda/0x00007fd8f0834488.accept(Unknown
Source)
at [email protected]/java.util.ArrayList.forEach(ArrayList.java:1596)
...
"Reference Handler" daemon prio=10 Id=9 RUNNABLE
at
[email protected]/java.lang.ref.Reference.waitForReferencePendingList(Native
Method)
at
[email protected]/java.lang.ref.Reference.processPendingReferences(Reference.java:246)
at
[email protected]/java.lang.ref.Reference$ReferenceHandler.run(Reference.java:208)
"Finalizer" daemon prio=8 Id=10 WAITING on
java.lang.ref.NativeReferenceQueue$Lock@19c2e5df
at [email protected]/java.lang.Object.wait0(Native Method)
- waiting on java.lang.ref.NativeReferenceQueue$Lock@19c2e5df
at [email protected]/java.lang.Object.wait(Unknown Source)
at [email protected]/java.lang.Object.wait(Unknown Source)
at
[email protected]/java.lang.ref.NativeReferenceQueue.await(NativeReferenceQueue.java:48)
at
[email protected]/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:158)
at
[email protected]/java.lang.ref.NativeReferenceQueue.remove(NativeReferenceQueue.java:89)
at
[email protected]/java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:173)
"Signal Dispatcher" daemon prio=9 Id=11 RUNNABLE
"Common-Cleaner" daemon prio=8 Id=26 TIMED_WAITING on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6c629587
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6c629587
at
[email protected]/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:269)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1886)
at
[email protected]/java.lang.ref.ReferenceQueue.await(ReferenceQueue.java:71)
at
[email protected]/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:143)
at
[email protected]/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:218)
at
[email protected]/jdk.internal.ref.CleanerImpl.run(CleanerImpl.java:140)
at [email protected]/java.lang.Thread.runWith(Thread.java:1596)
...
"Monitor Ctrl-Break" daemon prio=5 Id=27 RUNNABLE (in native)
at [email protected]/sun.nio.ch.SocketDispatcher.read0(Native Method)
at
[email protected]/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:47)
at
[email protected]/sun.nio.ch.NioSocketImpl.tryRead(NioSocketImpl.java:256)
at
[email protected]/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:307)
at
[email protected]/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346)
at
[email protected]/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796)
at
[email protected]/java.net.Socket$SocketInputStream.read(Socket.java:1099)
at
[email protected]/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:350)
...
Number of locked synchronizers = 3
- java.util.concurrent.locks.ReentrantLock$NonfairSync@5f91d91f
- java.util.concurrent.locks.ReentrantLock$NonfairSync@410ff902
- java.util.concurrent.locks.ReentrantLock$NonfairSync@4fed00e1
"Notification Thread" daemon prio=9 Id=28 RUNNABLE
"Attach Listener" daemon prio=9 Id=46 RUNNABLE
"FastTimestamps updater" daemon prio=5 Id=49 TIMED_WAITING on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@375b6a36
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@375b6a36
at
[email protected]/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:269)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1797)
at
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1182)
at
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
at
[email protected]/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
at
[email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at
[email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
...
"ForkJoinPool.commonPool-worker-1" daemon prio=5 Id=51 WAITING on
java.util.concurrent.ForkJoinPool@66d314e3
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.ForkJoinPool@66d314e3
at
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
at
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
at
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
at
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
"ForkJoinPool.commonPool-worker-2" daemon prio=5 Id=65 TIMED_WAITING on
java.util.concurrent.ForkJoinPool@66d314e3
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.ForkJoinPool@66d314e3
at
[email protected]/java.util.concurrent.locks.LockSupport.parkUntil(LockSupport.java:449)
at
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1891)
at
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
at
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
"ForkJoinPool.commonPool-worker-3" daemon prio=5 Id=66 WAITING on
java.util.concurrent.ForkJoinPool@66d314e3
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.ForkJoinPool@66d314e3
at
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
at
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
at
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
at
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
"ForkJoinPool.commonPool-worker-4" daemon prio=5 Id=1027 WAITING on
java.util.concurrent.ForkJoinPool@66d314e3
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.ForkJoinPool@66d314e3
at
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
at
[email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
at
[email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
at
[email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
"test-ExecutionServiceImplTest-outdatedNodeLeftEventDoesntCauseQueryToHang-commonExecutor0"
prio=5 Id=289439 WAITING on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@582a26d2
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@582a26d2
at
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
at
[email protected]/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
at
[email protected]/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1746)
at
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1170)
at
[email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
...
"%node_1%sql-execution-pool-0" prio=5 Id=289442 WAITING on
java.util.concurrent.CountDownLatch$Sync@7fe7df3e
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.CountDownLatch$Sync@7fe7df3e
at
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:788)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1133)
at
[email protected]/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:230)
at
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.lambda$outdatedNodeLeftEventDoesntCauseQueryToHang$69(ExecutionServiceImplTest.java:1089)
at
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$$Lambda/0x00007fd8f079ca28.intercept(Unknown
Source)
at
app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.onReceive(ExecutionServiceImplTest.java:1626)
...
Number of locked synchronizers = 1
- java.util.concurrent.ThreadPoolExecutor$Worker@3e398f9c
"%node_1%sql-execution-pool-1" prio=5 Id=289444 WAITING on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1301102e
at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1301102e
at
[email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
at
[email protected]/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
at
[email protected]/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
at
[email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1746)
at
[email protected]/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435)
at
[email protected]/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
...
{noformat}
Need to investigate and fix the issue.
The issue can be easily reproduced with adding a little delay to test method
IginteTestUtils.runAsync(..) before returning future
{code:Java}
}).start();
// add sleep(100) here
return fut;
{code}
*Note:* there is also a thread leak present, because after each test it is
necessary to stop the running QueryTaskExecutor-s
> Sql. ExecutionServiceImplTest.outdatedNodeLeftEventDoesntCauseQueryToHang may
> hang
> ----------------------------------------------------------------------------------
>
> Key: IGNITE-28084
> URL: https://issues.apache.org/jira/browse/IGNITE-28084
> Project: Ignite
> Issue Type: Bug
> Components: sql ai3
> Reporter: Pavel Pereslegin
> Assignee: Pavel Pereslegin
> Priority: Major
> Labels: MakeTeamcityGreenAgain, ignite-3
>
> In a very rare case the test {{outdatedNodeLeftEventDoesntCauseQueryToHang}}
> hangs and fails with the following error:
> {noformat}
> java.util.concurrent.TimeoutException
> at
> java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
> at
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
> at
> org.apache.ignite.internal.testframework.IgniteTestUtils.await(IgniteTestUtils.java:853)
> at
> org.apache.ignite.internal.testframework.IgniteTestUtils.await(IgniteTestUtils.java:882)
> at
> org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.outdatedNodeLeftEventDoesntCauseQueryToHang(ExecutionServiceImplTest.java:1112)
> at java.base/java.lang.reflect.Method.invoke(Method.java:580)
> at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
> at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
> {noformat}
> The output also includes diagnostic information:
> {noformat}
> [2026-03-05T20:54:11,288][WARN ][main][ExecutionServiceImpl] SQL execution
> service could not be stopped within the specified timeout (5000 ms).Debug
> info for query: ExecutionId [queryId=e629446d-3a49-4651-865e-e73668103ba1,
> executionToken=0] (canceled=true, stopped=false)
> Coordinator node: node_1 (current node)
> Root node state: completed exceptionally
> (org.apache.ignite.internal.sql.engine.QueryCancelledException: IGN-SQL-6 The
> query was cancelled while executing. TraceId:ad076434)
> Fragments awaiting init completion:
> id=1, node=node_1
> id=2, node=node_1
> id=2, node=node_2
> id=2, node=node_3
> Dumping threads:
> "main" prio=5 Id=1 RUNNABLE
> at
> [email protected]/sun.management.ThreadImpl.dumpThreads0(Native Method)
> at
> [email protected]/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:518)
> at
> [email protected]/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:506)
> at
> app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.dumpThreads(ExecutionServiceImpl.java:898)
> at
> app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.stop(ExecutionServiceImpl.java:712)
> at
> app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.lambda$tearDown$5(ExecutionServiceImplTest.java:334)
> at
> app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$$Lambda/0x00007fd8f0834488.accept(Unknown
> Source)
> at [email protected]/java.util.ArrayList.forEach(ArrayList.java:1596)
> ...
> "Reference Handler" daemon prio=10 Id=9 RUNNABLE
> at
> [email protected]/java.lang.ref.Reference.waitForReferencePendingList(Native
> Method)
> at
> [email protected]/java.lang.ref.Reference.processPendingReferences(Reference.java:246)
> at
> [email protected]/java.lang.ref.Reference$ReferenceHandler.run(Reference.java:208)
> "Finalizer" daemon prio=8 Id=10 WAITING on
> java.lang.ref.NativeReferenceQueue$Lock@19c2e5df
> at [email protected]/java.lang.Object.wait0(Native Method)
> - waiting on java.lang.ref.NativeReferenceQueue$Lock@19c2e5df
> at [email protected]/java.lang.Object.wait(Unknown Source)
> at [email protected]/java.lang.Object.wait(Unknown Source)
> at
> [email protected]/java.lang.ref.NativeReferenceQueue.await(NativeReferenceQueue.java:48)
> at
> [email protected]/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:158)
> at
> [email protected]/java.lang.ref.NativeReferenceQueue.remove(NativeReferenceQueue.java:89)
> at
> [email protected]/java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:173)
> "Signal Dispatcher" daemon prio=9 Id=11 RUNNABLE
> "Common-Cleaner" daemon prio=8 Id=26 TIMED_WAITING on
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6c629587
> at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
> - waiting on
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6c629587
> at
> [email protected]/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:269)
> at
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1886)
> at
> [email protected]/java.lang.ref.ReferenceQueue.await(ReferenceQueue.java:71)
> at
> [email protected]/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:143)
> at
> [email protected]/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:218)
> at
> [email protected]/jdk.internal.ref.CleanerImpl.run(CleanerImpl.java:140)
> at [email protected]/java.lang.Thread.runWith(Thread.java:1596)
> ...
> "Monitor Ctrl-Break" daemon prio=5 Id=27 RUNNABLE (in native)
> at [email protected]/sun.nio.ch.SocketDispatcher.read0(Native Method)
> at
> [email protected]/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:47)
> at
> [email protected]/sun.nio.ch.NioSocketImpl.tryRead(NioSocketImpl.java:256)
> at
> [email protected]/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:307)
> at
> [email protected]/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346)
> at
> [email protected]/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796)
> at
> [email protected]/java.net.Socket$SocketInputStream.read(Socket.java:1099)
> at
> [email protected]/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:350)
> ...
> Number of locked synchronizers = 3
> - java.util.concurrent.locks.ReentrantLock$NonfairSync@5f91d91f
> - java.util.concurrent.locks.ReentrantLock$NonfairSync@410ff902
> - java.util.concurrent.locks.ReentrantLock$NonfairSync@4fed00e1
> "Notification Thread" daemon prio=9 Id=28 RUNNABLE
> "Attach Listener" daemon prio=9 Id=46 RUNNABLE
> "FastTimestamps updater" daemon prio=5 Id=49 TIMED_WAITING on
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@375b6a36
> at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
> - waiting on
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@375b6a36
> at
> [email protected]/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:269)
> at
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1797)
> at
> [email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1182)
> at
> [email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
> at
> [email protected]/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
> at
> [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
> at
> [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
> ...
> "ForkJoinPool.commonPool-worker-1" daemon prio=5 Id=51 WAITING on
> java.util.concurrent.ForkJoinPool@66d314e3
> at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
> - waiting on java.util.concurrent.ForkJoinPool@66d314e3
> at
> [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
> at
> [email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
> at
> [email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
> at
> [email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
> "ForkJoinPool.commonPool-worker-2" daemon prio=5 Id=65 TIMED_WAITING on
> java.util.concurrent.ForkJoinPool@66d314e3
> at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
> - waiting on java.util.concurrent.ForkJoinPool@66d314e3
> at
> [email protected]/java.util.concurrent.locks.LockSupport.parkUntil(LockSupport.java:449)
> at
> [email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1891)
> at
> [email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
> at
> [email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
> "ForkJoinPool.commonPool-worker-3" daemon prio=5 Id=66 WAITING on
> java.util.concurrent.ForkJoinPool@66d314e3
> at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
> - waiting on java.util.concurrent.ForkJoinPool@66d314e3
> at
> [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
> at
> [email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
> at
> [email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
> at
> [email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
> "ForkJoinPool.commonPool-worker-4" daemon prio=5 Id=1027 WAITING on
> java.util.concurrent.ForkJoinPool@66d314e3
> at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
> - waiting on java.util.concurrent.ForkJoinPool@66d314e3
> at
> [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
> at
> [email protected]/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
> at
> [email protected]/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
> at
> [email protected]/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
> "test-ExecutionServiceImplTest-outdatedNodeLeftEventDoesntCauseQueryToHang-commonExecutor0"
> prio=5 Id=289439 WAITING on
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@582a26d2
> at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
> - waiting on
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@582a26d2
> at
> [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
> at
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
> at
> [email protected]/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
> at
> [email protected]/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
> at
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1746)
> at
> [email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1170)
> at
> [email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
> ...
> "%node_1%sql-execution-pool-0" prio=5 Id=289442 WAITING on
> java.util.concurrent.CountDownLatch$Sync@7fe7df3e
> at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
> - waiting on java.util.concurrent.CountDownLatch$Sync@7fe7df3e
> at
> [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
> at
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:788)
> at
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1133)
> at
> [email protected]/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:230)
> at
> app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.lambda$outdatedNodeLeftEventDoesntCauseQueryToHang$69(ExecutionServiceImplTest.java:1089)
> at
> app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$$Lambda/0x00007fd8f079ca28.intercept(Unknown
> Source)
> at
> app//org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.onReceive(ExecutionServiceImplTest.java:1626)
> ...
> Number of locked synchronizers = 1
> - java.util.concurrent.ThreadPoolExecutor$Worker@3e398f9c
> "%node_1%sql-execution-pool-1" prio=5 Id=289444 WAITING on
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1301102e
> at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
> - waiting on
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1301102e
> at
> [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
> at
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
> at
> [email protected]/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
> at
> [email protected]/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
> at
> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1746)
> at
> [email protected]/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435)
> at
> [email protected]/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
> ...
> {noformat}
> Need to investigate and fix the issue.
> The issue can be easily reproduced with adding a little delay to test method
> IginteTestUtils.runAsync(..) before returning future
> {code:Java}
> }).start();
> // add sleep(100) here
> return fut;
> {code}
> *Note:* there is also a thread leak present, because after each test it is
> necessary to stop the running QueryTaskExecutor-s
--
This message was sent by Atlassian Jira
(v8.20.10#820010)