Hello, I am using Ignite 2.7 embedded inside a Tomcat application, and have run into an issue where the application does not shut down because an Ignite thread is blocked. I think it would be good for Ignite to avoid hanging in this situation. What do you think? Here are the details:
1. The Ignite node gets segmented due to CPU pressure (this part is on purpose).
2. The StopNodeFailureHandler invokes the stop procedure, which acquires a lock.
3. The application detects the segmentation via custom code and also starts shutting down. As part of that, it tells Ignite to stop via Ignition.allGrids().close().
4. The close call triggered by #3 waits for the lock acquired in step #2 and remains blocked forever, preventing the application from shutting down.

Here are the stack traces from the two threads I mentioned:

First:
---------------------------------------------------------------------
"node-stopper" #272 prio=5 os_prio=0 tid=0x00007ffb1801d000 nid=0x126c waiting on condition [0x00007ffaf16d6000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000cb189368> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos(AbstractQueuedSynchronizer.java:934)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1247)
        at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:1115)
        at org.apache.ignite.internal.util.StripedCompositeReadWriteLock$WriteLock.tryLock(StripedCompositeReadWriteLock.java:220)
        at org.apache.ignite.internal.processors.cache.GridCacheGateway.onStopped(GridCacheGateway.java:315)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.blockGateways(GridCacheProcessor.java:1102)
        at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2344)
        at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2228)
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2612)
        - locked <0x00000000c71e9000> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2575)
        at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:379)
        at org.apache.ignite.failure.StopNodeFailureHandler$1.run(StopNodeFailureHandler.java:36)
        at java.lang.Thread.run(Thread.java:748)
---------------------------------------------------------------------

Second (omitting custom code at the top of the trace):
---------------------------------------------------------------------
"pool-18-thread-2" #127 prio=5 os_prio=0 tid=0x00007ffb3663e000 nid=0x1146 waiting for monitor entry [0x00007ffad94a6000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2583)
        - waiting to lock <0x00000000c71e9000> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2575)
        at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:379)
        at org.apache.ignite.Ignition.stop(Ignition.java:225)
        at org.apache.ignite.internal.IgniteKernal.close(IgniteKernal.java:3568)
        application code
---------------------------------------------------------------------

Thank you!
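
For discussion, here is one application-side mitigation that might be worth trying in the meantime. It is a sketch only and has not been tested against Ignite. The idea is to run the blocking stop call on a separate daemon thread and wait for it with a timeout, so the application's own shutdown path is never parked on the IgniteNamedInstance monitor. The class name BoundedStop, the method stopWithTimeout, and the timeout values are all hypothetical. The Ignite call appears only in a comment as an assumed usage.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Ignite.
final class BoundedStop {

    /**
     * Runs stopAction on a daemon thread and waits at most timeoutMillis for it.
     * Returns true if the action finished in time, false if it is still running
     * (or if the waiting thread was interrupted).
     */
    static boolean stopWithTimeout(Runnable stopAction, long timeoutMillis) {
        CountDownLatch done = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            try {
                stopAction.run();
            } finally {
                done.countDown();
            }
        }, "bounded-stop");

        // A thread blocked on a monitor cannot be interrupted, so the worker is
        // marked as a daemon: if it stays stuck, it will not keep the JVM alive.
        worker.setDaemon(true);
        worker.start();

        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed usage in the real application (not exercised here):
        //   stopWithTimeout(() -> Ignition.stop(igniteInstanceName, true), 30_000);
        boolean finished = stopWithTimeout(() -> { /* stand-in for the stop call */ }, 1_000);
        System.out.println("stop finished in time: " + finished);
    }
}
```

This only keeps the caller from blocking. The worker thread itself would still be stuck behind the lock, and in a Tomcat deployment it would linger until the JVM exits, so it does not replace a fix on the Ignite side.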
