[ https://issues.apache.org/jira/browse/IGNITE-7696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16369680#comment-16369680 ]
Roman Shtykh commented on IGNITE-7696:
--------------------------------------

[~frsyuki] I suggest you provide cluster configuration details and draw attention to the issue on the mailing list.

> Deadlock at GridDhtAtomicCache.lockEntries called through GridDhtAtomicCache.updateAllAsyncInternal
> ---------------------------------------------------------------------------------------------------
>
> Key: IGNITE-7696
> URL: https://issues.apache.org/jira/browse/IGNITE-7696
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Affects Versions: 2.3
> Environment:
>   * Ignite 2.3
>   * OpenJDK version "1.8.0_151"
>   * Linux 4.4.0
> Reporter: Sadayuki Furuhashi
> Priority: Major
>
> We observed that all nodes in a cluster stall completely and put/get/remove
> operations on a cache block forever. When this happens, the thread dump
> contains the following:
> {code}
> 2018-02-14_04:21:33.84410 Found one Java-level deadlock:
> 2018-02-14_04:21:33.84410 =============================
> 2018-02-14_04:21:33.84411 "sys-#41%IgniteManager%":
> 2018-02-14_04:21:33.84411 waiting to lock monitor 0x00007f6d5e41a558 (object 0x0000000781083ef0, a org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry),
> 2018-02-14_04:21:33.84411 which is held by "sys-stripe-5-#6%IgniteManager%"
> 2018-02-14_04:21:33.84412 "sys-stripe-5-#6%IgniteManager%":
> 2018-02-14_04:21:33.84412 waiting to lock monitor 0x00007f6d5e41de68 (object 0x0000000781083e70, a org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry)
> 2018-02-14_04:21:33.84412 in JNI, which is held by "sys-stripe-2-#3%IgniteManager%"
> 2018-02-14_04:21:33.84412 "sys-stripe-2-#3%IgniteManager%":
> 2018-02-14_04:21:33.84413 waiting to lock monitor 0x00007f6d5e41a558 (object 0x0000000781083ef0, a org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry)
> 2018-02-14_04:21:33.84413 in JNI, which is held by "sys-stripe-5-#6%IgniteManager%"
> 2018-02-14_04:21:33.84413
> 2018-02-14_04:21:33.84414 Java stack information for the threads listed above:
> 2018-02-14_04:21:33.84414 ===================================================
> 2018-02-14_04:21:33.84416 "sys-#41%IgniteManager%":
> 2018-02-14_04:21:33.84416 at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.markObsoleteVersion(GridCacheMapEntry.java:2153)
> 2018-02-14_04:21:33.84417 - waiting to lock <0x0000000781083ef0> (a org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry)
> 2018-02-14_04:21:33.84417 at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.removeVersionedEntry(GridDhtLocalPartition.java:368)
> 2018-02-14_04:21:33.84418 at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.cleanupRemoveQueue(GridDhtLocalPartition.java:392)
> 2018-02-14_04:21:33.84418 at org.apache.ignite.internal.processors.cache.GridCacheProcessor$RemovedItemsCleanupTask$1.run(GridCacheProcessor.java:4051)
> 2018-02-14_04:21:33.84418 at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6687)
> 2018-02-14_04:21:33.84419 at org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
> 2018-02-14_04:21:33.84419 at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> 2018-02-14_04:21:33.84419 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 2018-02-14_04:21:33.84420 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 2018-02-14_04:21:33.84421 at java.lang.Thread.run(Thread.java:748)
> 2018-02-14_04:21:33.84421 "sys-stripe-5-#6%IgniteManager%":
> 2018-02-14_04:21:33.84421 at sun.misc.Unsafe.monitorEnter(Native Method)
> 2018-02-14_04:21:33.84421 at org.apache.ignite.internal.util.GridUnsafe.monitorEnter(GridUnsafe.java:1207)
> 2018-02-14_04:21:33.84422 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.lockEntries(GridDhtAtomicCache.java:2848)
> 2018-02-14_04:21:33.84422 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1707)
> 2018-02-14_04:21:33.84423 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1629)
> 2018-02-14_04:21:33.84423 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3056)
> 2018-02-14_04:21:33.84424 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:131)
> 2018-02-14_04:21:33.84424 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:267)
> 2018-02-14_04:21:33.84425 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:262)
> 2018-02-14_04:21:33.84425 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
> 2018-02-14_04:21:33.84425 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
> 2018-02-14_04:21:33.84426 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
> 2018-02-14_04:21:33.84426 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
> 2018-02-14_04:21:33.84426 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
> 2018-02-14_04:21:33.84427 at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
> 2018-02-14_04:21:33.84427 at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
> 2018-02-14_04:21:33.84427 at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
> 2018-02-14_04:21:33.84428 at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
> 2018-02-14_04:21:33.84429 at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
> 2018-02-14_04:21:33.84429 at org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:505)
> 2018-02-14_04:21:33.84429 at java.lang.Thread.run(Thread.java:748)
> 2018-02-14_04:21:33.84429 "sys-stripe-2-#3%IgniteManager%":
> 2018-02-14_04:21:33.84430 at sun.misc.Unsafe.monitorEnter(Native Method)
> 2018-02-14_04:21:33.84430 at org.apache.ignite.internal.util.GridUnsafe.monitorEnter(GridUnsafe.java:1207)
> 2018-02-14_04:21:33.84430 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.lockEntries(GridDhtAtomicCache.java:2848)
> 2018-02-14_04:21:33.84431 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1707)
> 2018-02-14_04:21:33.84431 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1629)
> 2018-02-14_04:21:33.84431 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3056)
> 2018-02-14_04:21:33.84433 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:131)
> 2018-02-14_04:21:33.84433 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:267)
> 2018-02-14_04:21:33.84433 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:262)
> 2018-02-14_04:21:33.84434 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
> 2018-02-14_04:21:33.84434 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
> 2018-02-14_04:21:33.84434 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
> 2018-02-14_04:21:33.84435 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
> 2018-02-14_04:21:33.84435 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
> 2018-02-14_04:21:33.84436 at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
> 2018-02-14_04:21:33.84437 at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
> 2018-02-14_04:21:33.84437 at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
> 2018-02-14_04:21:33.84437 at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
> 2018-02-14_04:21:33.84438 at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
> 2018-02-14_04:21:33.84438 at org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:505)
> 2018-02-14_04:21:33.84438 at java.lang.Thread.run(Thread.java:748)
> 2018-02-14_04:21:33.84439
> 2018-02-14_04:21:33.84439 Found 1 deadlock.
> {code}
> This indicates that GridDhtAtomicCache.lockEntries (threads sys-stripe-5-...
> and sys-stripe-2-...) is causing the deadlock, and
> GridCacheMapEntry.markObsoleteVersion (sys-#41...) is involved in it:
> * Thread "sys-stripe-5-..." locked 0x...3ef0, waits for 0x...3e70
> * Thread "sys-stripe-2-..." locked 0x...3e70, waits for 0x...3ef0
> * Thread "sys-#41..." waits for 0x...3ef0

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
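
The hold/wait relations reported in the thread dump can be checked mechanically. The following is a minimal sketch, not Ignite code: the class WaitForCycle and the method findCycle are hypothetical names introduced only for illustration, and the thread names and abbreviated monitor addresses are copied from the dump above. It follows "waits for" and "held by" links from a starting thread and returns the threads that form a cycle, if any.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Ignite or the JDK.
public class WaitForCycle {

    // Follows thread -> awaited lock -> lock holder links starting at `start`.
    // Returns the threads forming a cycle, or an empty list if the chain ends.
    public static List<String> findCycle(String start,
                                         Map<String, String> waitsFor,
                                         Map<String, String> heldBy) {
        List<String> path = new ArrayList<>();
        String thread = start;
        while (thread != null && !path.contains(thread)) {
            path.add(thread);
            String lock = waitsFor.get(thread);             // lock this thread is blocked on
            thread = (lock == null) ? null : heldBy.get(lock); // current owner of that lock
        }
        if (thread == null) {
            return Collections.emptyList();                 // chain ended: no cycle
        }
        // `thread` was seen before: the cycle starts at its first occurrence.
        return new ArrayList<>(path.subList(path.indexOf(thread), path.size()));
    }

    public static void main(String[] args) {
        // Relations taken from the thread dump in the report.
        Map<String, String> heldBy = new HashMap<>();
        heldBy.put("0x...3ef0", "sys-stripe-5");
        heldBy.put("0x...3e70", "sys-stripe-2");

        Map<String, String> waitsFor = new HashMap<>();
        waitsFor.put("sys-stripe-5", "0x...3e70");
        waitsFor.put("sys-stripe-2", "0x...3ef0");
        waitsFor.put("sys-#41", "0x...3ef0");

        System.out.println(findCycle("sys-#41", waitsFor, heldBy));
    }
}
```

Starting from sys-#41, the chain reaches the two stripe threads and loops between them, which matches the reading that the stripe threads form the cycle while sys-#41 is only blocked behind it.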