[
https://issues.apache.org/jira/browse/IGNITE-7696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16369680#comment-16369680
]
Roman Shtykh commented on IGNITE-7696:
--------------------------------------
[~frsyuki] I suggest you provide cluster configuration details and raise
attention to the issue in the ml.
> Deadlock at GridDhtAtomicCache.lockEntries called through
> GridDhtAtomicCache.updateAllAsyncInternal
> ---------------------------------------------------------------------------------------------------
>
> Key: IGNITE-7696
> URL: https://issues.apache.org/jira/browse/IGNITE-7696
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Affects Versions: 2.3
> Environment: * Ignite 2.3
> * OpenJDK version "1.8.0_151"
> * Linux 4.4.0
> Reporter: Sadayuki Furuhashi
> Priority: Major
>
> We observed that all nodes in a cluster completely stalls and put/get/remove
> operations to a cache block for ever. When it happens, we can see following
> log in thread dump:
> {code}
> 2018-02-14_04:21:33.84410 Found one Java-level deadlock:
> 2018-02-14_04:21:33.84410 =============================
> 2018-02-14_04:21:33.84411 "sys-#41%IgniteManager%":
> 2018-02-14_04:21:33.84411 waiting to lock monitor 0x00007f6d5e41a558
> (object 0x0000000781083ef0, a
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry),
> 2018-02-14_04:21:33.84411 which is held by "sys-stripe-5-#6%IgniteManager%"
> 2018-02-14_04:21:33.84412 "sys-stripe-5-#6%IgniteManager%":
> 2018-02-14_04:21:33.84412 waiting to lock monitor 0x00007f6d5e41de68
> (object 0x0000000781083e70, a
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry)
> 2018-02-14_04:21:33.84412 in JNI, which is held by
> "sys-stripe-2-#3%IgniteManager%"
> 2018-02-14_04:21:33.84412 "sys-stripe-2-#3%IgniteManager%":
> 2018-02-14_04:21:33.84413 waiting to lock monitor 0x00007f6d5e41a558
> (object 0x0000000781083ef0, a
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry)
> 2018-02-14_04:21:33.84413 in JNI, which is held by
> "sys-stripe-5-#6%IgniteManager%"
> 2018-02-14_04:21:33.84413
> 2018-02-14_04:21:33.84414 Java stack information for the threads listed above:
> 2018-02-14_04:21:33.84414 ===================================================
> 2018-02-14_04:21:33.84416 "sys-#41%IgniteManager%":
> 2018-02-14_04:21:33.84416 at
> org.apache.ignite.internal.processors.cache.GridCacheMapEntry.markObsoleteVersion(GridCacheMapEntry.java:2153)
> 2018-02-14_04:21:33.84417 - waiting to lock <0x0000000781083ef0> (a
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry)
> 2018-02-14_04:21:33.84417 at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.removeVersionedEntry(GridDhtLocalPartition.java:368)
> 2018-02-14_04:21:33.84418 at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.cleanupRemoveQueue(GridDhtLocalPartition.java:392)
> 2018-02-14_04:21:33.84418 at
> org.apache.ignite.internal.processors.cache.GridCacheProcessor$RemovedItemsCleanupTask$1.run(GridCacheProcessor.java:4051)
> 2018-02-14_04:21:33.84418 at
> org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6687)
> 2018-02-14_04:21:33.84419 at
> org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
> 2018-02-14_04:21:33.84419 at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> 2018-02-14_04:21:33.84419 at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 2018-02-14_04:21:33.84420 at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 2018-02-14_04:21:33.84421 at java.lang.Thread.run(Thread.java:748)
> 2018-02-14_04:21:33.84421 "sys-stripe-5-#6%IgniteManager%":
> 2018-02-14_04:21:33.84421 at sun.misc.Unsafe.monitorEnter(Native Method)
> 2018-02-14_04:21:33.84421 at
> org.apache.ignite.internal.util.GridUnsafe.monitorEnter(GridUnsafe.java:1207)
> 2018-02-14_04:21:33.84422 at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.lockEntries(GridDhtAtomicCache.java:2848)
> 2018-02-14_04:21:33.84422 at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1707)
> 2018-02-14_04:21:33.84423 at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1629)
> 2018-02-14_04:21:33.84423 at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3056)
> 2018-02-14_04:21:33.84424 at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:131)
> 2018-02-14_04:21:33.84424 at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:267)
> 2018-02-14_04:21:33.84425 at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:262)
> 2018-02-14_04:21:33.84425 at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
> 2018-02-14_04:21:33.84425 at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
> 2018-02-14_04:21:33.84426 at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
> 2018-02-14_04:21:33.84426 at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
> 2018-02-14_04:21:33.84426 at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
> 2018-02-14_04:21:33.84427 at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
> 2018-02-14_04:21:33.84427 at
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
> 2018-02-14_04:21:33.84427 at
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
> 2018-02-14_04:21:33.84428 at
> org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
> 2018-02-14_04:21:33.84429 at
> org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
> 2018-02-14_04:21:33.84429 at
> org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:505)
> 2018-02-14_04:21:33.84429 at java.lang.Thread.run(Thread.java:748)
> 2018-02-14_04:21:33.84429 "sys-stripe-2-#3%IgniteManager%":
> 2018-02-14_04:21:33.84430 at sun.misc.Unsafe.monitorEnter(Native Method)
> 2018-02-14_04:21:33.84430 at
> org.apache.ignite.internal.util.GridUnsafe.monitorEnter(GridUnsafe.java:1207)
> 2018-02-14_04:21:33.84430 at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.lockEntries(GridDhtAtomicCache.java:2848)
> 2018-02-14_04:21:33.84431 at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1707)
> 2018-02-14_04:21:33.84431 at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1629)
> 2018-02-14_04:21:33.84431 at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3056)
> 2018-02-14_04:21:33.84433 at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:131)
> 2018-02-14_04:21:33.84433 at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:267)
> 2018-02-14_04:21:33.84433 at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:262)
> 2018-02-14_04:21:33.84434 at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
> 2018-02-14_04:21:33.84434 at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
> 2018-02-14_04:21:33.84434 at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
> 2018-02-14_04:21:33.84435 at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
> 2018-02-14_04:21:33.84435 at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
> 2018-02-14_04:21:33.84436 at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
> 2018-02-14_04:21:33.84437 at
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
> 2018-02-14_04:21:33.84437 at
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
> 2018-02-14_04:21:33.84437 at
> org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
> 2018-02-14_04:21:33.84438 at
> org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
> 2018-02-14_04:21:33.84438 at
> org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:505)
> 2018-02-14_04:21:33.84438 at java.lang.Thread.run(Thread.java:748)
> 2018-02-14_04:21:33.84439
> 2018-02-14_04:21:33.84439 Found 1 deadlock.
> {code}
> This indicates that GridDhtAtomicCache.lockEntries (threads sys-stripe-5-...
> and sys-stripe-7-...) is causing deadlock, and
> GridCacheMapEntry.markObsoleteVersion (sys-#41...) is involved in it:
> * Thread "sys-stripe-5-..." locked 0x...3ef0, waits for 0x...3e70
> * Thread "sys-stripe-7-..." locked 0x...3e70, waits for 0x...3ef0
> * Thread "sys-#41..." waits for 0x...3ef0
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)