Roman Puchkovskiy created IGNITE-20750:
------------------------------------------
Summary: ExecutionServiceImpl#stop() may hang forever
Key: IGNITE-20750
URL: https://issues.apache.org/jira/browse/IGNITE-20750
Project: Ignite
Issue Type: Bug
Reporter: Roman Puchkovskiy
Fix For: 3.0.0-beta2
A build hung on TC:
[https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRunner/7589436?hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildProblemsSection=true&expandBuildChangesSection=true]
In the thread dump the following can be seen:
"Test worker" #1 prio=5 os_prio=0 cpu=123640.80ms elapsed=3573.05s
tid=0x00007f8de802e000 nid=0x2110df waiting on condition [0x00007f8decb1d000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
- parking to wait for <0x000000071962ff08> (a java.util.concurrent.CompletableFuture$Signaller)
at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:194)
at java.util.concurrent.CompletableFuture$Signaller.block([email protected]/CompletableFuture.java:1796)
at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3128)
at java.util.concurrent.CompletableFuture.waitingGet([email protected]/CompletableFuture.java:1823)
at java.util.concurrent.CompletableFuture.join([email protected]/CompletableFuture.java:2043)
at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.stop(ExecutionServiceImpl.java:402)
at org.apache.ignite.internal.sql.engine.SqlQueryProcessor$$Lambda$2103/0x0000000800ba7840.close(Unknown Source)
at org.apache.ignite.internal.util.IgniteUtils.lambda$closeAll$0(IgniteUtils.java:534)
at org.apache.ignite.internal.util.IgniteUtils$$Lambda$2054/0x0000000800b8f040.accept(Unknown Source)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept([email protected]/ForEachOps.java:183)
at java.util.stream.ReferencePipeline$2$1.accept([email protected]/ReferencePipeline.java:177)
at java.util.stream.ReferencePipeline$3$1.accept([email protected]/ReferencePipeline.java:195)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining([email protected]/ArrayList.java:1655)
at java.util.stream.AbstractPipeline.copyInto([email protected]/AbstractPipeline.java:484)
at java.util.stream.AbstractPipeline.wrapAndCopyInto([email protected]/AbstractPipeline.java:474)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential([email protected]/ForEachOps.java:150)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential([email protected]/ForEachOps.java:173)
at java.util.stream.AbstractPipeline.evaluate([email protected]/AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach([email protected]/ReferencePipeline.java:497)
at org.apache.ignite.internal.util.IgniteUtils.closeAll(IgniteUtils.java:532)
at org.apache.ignite.internal.sql.engine.SqlQueryProcessor.stop(SqlQueryProcessor.java:380)
- locked <0x0000000721d65408> (a org.apache.ignite.internal.sql.engine.SqlQueryProcessor)
at org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
at org.apache.ignite.internal.app.LifecycleManager$$Lambda$2100/0x0000000800ba6c40.accept(Unknown Source)
at java.util.Iterator.forEachRemaining([email protected]/Iterator.java:133)
at org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
- locked <0x000000071e1eb730> (a org.apache.ignite.internal.app.LifecycleManager)
at org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115)
at org.apache.ignite.internal.app.IgniteImpl.stop(IgniteImpl.java:903)
at org.apache.ignite.internal.app.IgnitionImpl.lambda$stop$0(IgnitionImpl.java:113)
at org.apache.ignite.internal.app.IgnitionImpl$$Lambda$2056/0x0000000800b8f840.apply(Unknown Source)
at java.util.concurrent.ConcurrentHashMap.computeIfPresent([email protected]/ConcurrentHashMap.java:1822)
- locked <0x0000000736357750> (a java.util.concurrent.ConcurrentHashMap$Node)
at org.apache.ignite.internal.app.IgnitionImpl.stop(IgnitionImpl.java:111)
at org.apache.ignite.IgnitionManager.stop(IgnitionManager.java:96)
at org.apache.ignite.IgnitionManager.stop(IgnitionManager.java:82)
at org.apache.ignite.internal.Cluster.lambda$shutdown$11(Cluster.java:458)
at org.apache.ignite.internal.Cluster$$Lambda$2318/0x0000000800d89040.accept(Unknown Source)
at java.util.ArrayList.forEach([email protected]/ArrayList.java:1541)
at org.apache.ignite.internal.Cluster.shutdown(Cluster.java:458)
at org.apache.ignite.internal.ClusterPerClassIntegrationTest.afterAll(ClusterPerClassIntegrationTest.java:103)
at jdk.internal.reflect.GeneratedMethodAccessor145.invoke(Unknown Source)
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke([email protected]/DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke([email protected]/Method.java:566)
at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
at org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)
It looks like {{ExecutionServiceImpl#stop()}} hung forever: it calls {{f.join()}} on a future that apparently never completes.
There are 2 problems:
# Why does the future never complete?
# We should probably bound how long stop waits (30 seconds, for example), so that the node as a whole can still stop.
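To illustrate the second point, here is a minimal sketch of a bounded wait using only the standard {{CompletableFuture#get(long, TimeUnit)}} call. The class and method names ({{BoundedStopSketch}}, {{awaitCompletion}}) are hypothetical and are not taken from the Ignite code base; how {{ExecutionServiceImpl#stop()}} should react to a timeout (log, cancel, fail) is left open.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper, not Ignite code: waits for a future, but never longer than a given timeout. */
final class BoundedStopSketch {
    /**
     * Returns true if the future finished (normally, exceptionally, or by cancellation)
     * within the timeout, and false if the wait timed out or the thread was interrupted.
     */
    static boolean awaitCompletion(CompletableFuture<?> future, long timeout, TimeUnit unit) {
        try {
            // Unlike join(), get(timeout, unit) gives up after the timeout elapses.
            future.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            // The future is still pending; the caller decides how to proceed with the stop.
            return false;
        } catch (ExecutionException | CancellationException e) {
            // The future is done, even though it did not complete normally.
            return true;
        } catch (InterruptedException e) {
            // Restore the interrupt flag so that callers can observe it.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in for the future that never completes in the thread dump above.
        CompletableFuture<Void> neverCompleted = new CompletableFuture<>();

        // A short timeout keeps the demo fast; the issue suggests something like 30 seconds.
        boolean finished = awaitCompletion(neverCompleted, 100, TimeUnit.MILLISECONDS);

        System.out.println("finished in time: " + finished); // prints "finished in time: false"
    }
}
```

With such a helper, stop could log the timeout and continue stopping the remaining components instead of parking the thread indefinitely.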