Hello!

This looks similar to https://issues.apache.org/jira/browse/IGNITE-10910, though it is not exactly the same issue. Is it possible to create a reproducer for this behavior? It may already be fixed in master, since there were a few tickets regarding deadlocks on node stop.
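To make the reproducer request more concrete, here is a minimal plain-Java sketch of the locking pattern that the two quoted stack traces suggest. It does not use Ignite at all. All names in it (StopHangSketch, instanceMonitor, gatewayLock, the "cache-operation" reader thread) are hypothetical stand-ins, and the idea that some reader keeps holding the gateway lock is an assumption, since the traces do not show who holds it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the reported pattern; this is not Ignite code.
final class StopHangSketch {
    /** Returns the state of the "app-shutdown" thread while "node-stopper" holds the monitor. */
    static Thread.State observeCloserState() throws InterruptedException {
        final Object instanceMonitor = new Object();              // stands in for the IgniteNamedInstance monitor
        final ReentrantReadWriteLock gatewayLock = new ReentrantReadWriteLock(); // stands in for the gateway lock
        final CountDownLatch readerReady = new CountDownLatch(1);
        final CountDownLatch releaseReader = new CountDownLatch(1);
        final CountDownLatch monitorHeld = new CountDownLatch(1);

        // Assumption: something keeps the read side of the gateway lock held.
        Thread reader = new Thread(() -> {
            gatewayLock.readLock().lock();
            try {
                readerReady.countDown();
                releaseReader.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                gatewayLock.readLock().unlock();
            }
        }, "cache-operation");

        // Like the first trace: holds the monitor and waits in a timed tryLock on the write lock.
        Thread stopper = new Thread(() -> {
            synchronized (instanceMonitor) {
                monitorHeld.countDown();
                try {
                    while (!gatewayLock.writeLock().tryLock(50, TimeUnit.MILLISECONDS)) {
                        // Retry until the reader releases the lock.
                    }
                    gatewayLock.writeLock().unlock();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "node-stopper");

        // Like the second trace: needs the same monitor, so it blocks behind the stopper.
        Thread closer = new Thread(() -> {
            synchronized (instanceMonitor) {
                // A real application would stop the node here.
            }
        }, "app-shutdown");

        reader.start();
        readerReady.await();
        stopper.start();
        monitorHeld.await();
        closer.start();

        // Poll (for at most 5 seconds) until the closer is blocked on the monitor.
        Thread.State state = closer.getState();
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (state != Thread.State.BLOCKED && System.nanoTime() < deadline) {
            Thread.sleep(10);
            state = closer.getState();
        }

        // Unlike the reported hang, release the reader so the sketch terminates.
        releaseReader.countDown();
        reader.join();
        stopper.join();
        closer.join();
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("app-shutdown thread state: " + observeCloserState());
    }
}
```

If the real cause is similar, a reproducer against Ignite itself would need to show what keeps the gateway lock from being acquired during stop; the sketch only illustrates why the second thread cannot make progress while the first one keeps retrying.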

Regards,
-- 
Ilya Kasnacheev

Mon, 10 Jun 2019 at 21:31, Loredana Radulescu Ivanoff <[email protected]>:

> Hello,
>
> I am using Ignite 2.7 embedded inside a Tomcat application, and have run
> into an issue where the application does not shut down due to a blocked
> Ignite thread. I think it would be good for Ignite to avoid hanging in this
> situation, what do you think? Here are the details:
>
> 1. Ignite node gets segmented due to CPU pressure (this part is on purpose)
> 2. The StopNodeFailureHandler invokes the stop procedure, during which
> process a lock is acquired.
> 3. The application detects the segmentation via custom code and also
> starts shutting down, during which process it also tells Ignite to stop via
> Ignition.allGrids().close()
> 4. The close process triggered by #3 waits for the lock acquired in step
> #2 and remains blocked forever, preventing the application from shutting
> down.
>
> Here are the stack traces from the two threads I mentioned:
>
> First:
>
> ---------------------
> "node-stopper" #272 prio=5 os_prio=0 tid=0x00007ffb1801d000 nid=0x126c waiting on condition [0x00007ffaf16d6000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0x00000000cb189368> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos(AbstractQueuedSynchronizer.java:934)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1247)
>     at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:1115)
>     at org.apache.ignite.internal.util.StripedCompositeReadWriteLock$WriteLock.tryLock(StripedCompositeReadWriteLock.java:220)
>     at org.apache.ignite.internal.processors.cache.GridCacheGateway.onStopped(GridCacheGateway.java:315)
>     at org.apache.ignite.internal.processors.cache.GridCacheProcessor.blockGateways(GridCacheProcessor.java:1102)
>     at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2344)
>     at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2228)
>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2612)
>     - locked <0x00000000c71e9000> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2575)
>     at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:379)
>     at org.apache.ignite.failure.StopNodeFailureHandler$1.run(StopNodeFailureHandler.java:36)
>     at java.lang.Thread.run(Thread.java:748)
> ---------------------------------------------------------------------
>
> Second (omitting custom code at the top of the trace):
>
> -----------------------------
> "pool-18-thread-2" #127 prio=5 os_prio=0 tid=0x00007ffb3663e000 nid=0x1146 waiting for monitor entry [0x00007ffad94a6000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2583)
>     - waiting to lock <0x00000000c71e9000> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2575)
>     at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:379)
>     at org.apache.ignite.Ignition.stop(Ignition.java:225)
>     at org.apache.ignite.internal.IgniteKernal.close(IgniteKernal.java:3568)
>     application code
> ----------------------------------------------------------------------
>
> Thank you!
