Anilkumar Gingade created GEODE-6581:
----------------------------------------

             Summary: deadlock between tombstone gc and region initial image processing threads
                 Key: GEODE-6581
                 URL: https://issues.apache.org/jira/browse/GEODE-6581
             Project: Geode
          Issue Type: Bug
          Components: regions
            Reporter: Anilkumar Gingade


The tombstone GC thread can deadlock with the initial image processing 
thread because the two acquire the RegionEntry lock and the RegionSize lock 
in opposite orders (see the stack dump below).

This is similar to GEODE-6526, except that here removeTombstone is called 
from the initial image put operation.
{noformat}
Found one Java-level deadlock:
=============================
"Pooled Message Processor 90":
  waiting to lock monitor 0x00007fa780005978 (object 0x00000000fa19a890, a java.lang.Object),
  which is held by "Pooled Message Processor 14"
"Pooled Message Processor 14":
  waiting to lock monitor 0x00007fa7940edb68 (object 0x00000000fa633db0, a org.apache.geode.internal.cache.entries.VersionedThinDiskRegionEntryHeapStringKey2),
  which is held by "Pooled High Priority Message Processor 10"
"Pooled High Priority Message Processor 10":
  waiting to lock monitor 0x00007fa7940edab8 (object 0x00000000fbdaf610, a java.lang.String),
  which is held by "Pooled Message Processor 14"

Java stack information for the threads listed above:
===================================================
"Pooled Message Processor 90":
        at org.apache.geode.internal.cache.TombstoneService.gcTombstones(TombstoneService.java:209)
        - waiting to lock <0x00000000fa19a890> (a java.lang.Object)
        at org.apache.geode.internal.cache.LocalRegion.expireTombstones(LocalRegion.java:3293)
        at org.apache.geode.internal.cache.DistributedTombstoneOperation$TombstoneMessage.operateOnRegion(DistributedTombstoneOperation.java:169)
        at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1191)
        at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1091)
        at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:378)
        at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:444)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1121)
        at org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:109)
        at org.apache.geode.distributed.internal.ClusterDistributionManager$4$1.run(ClusterDistributionManager.java:791)
        at java.lang.Thread.run(Thread.java:748)
"Pooled Message Processor 14":
        at org.apache.geode.internal.cache.AbstractRegionMap.removeTombstone(AbstractRegionMap.java:3322)
        - waiting to lock <0x00000000fa633db0> (a org.apache.geode.internal.cache.entries.VersionedThinDiskRegionEntryHeapStringKey2)
        - locked <0x00000000fbdaf610> (a java.lang.String)
        at org.apache.geode.internal.cache.TombstoneService.gcTombstones(TombstoneService.java:259)
        - locked <0x00000000fa19a890> (a java.lang.Object)
        at org.apache.geode.internal.cache.LocalRegion.expireTombstones(LocalRegion.java:3293)
        at org.apache.geode.internal.cache.DistributedTombstoneOperation$TombstoneMessage.operateOnRegion(DistributedTombstoneOperation.java:169)
        at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1191)
        at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1091)
        at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:378)
        at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:444)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1121)
        at org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:109)
        at org.apache.geode.distributed.internal.ClusterDistributionManager$4$1.run(ClusterDistributionManager.java:791)
        at java.lang.Thread.run(Thread.java:748)
"Pooled High Priority Message Processor 10":
        at org.apache.geode.internal.cache.AbstractRegionMap.removeTombstone(AbstractRegionMap.java:3321)
        - waiting to lock <0x00000000fbdaf610> (a java.lang.String)
        at org.apache.geode.internal.cache.AbstractRegionMap.initialImagePut(AbstractRegionMap.java:954)
        - locked <0x00000000fa633db0> (a org.apache.geode.internal.cache.entries.VersionedThinDiskRegionEntryHeapStringKey2)
        - locked <0x00000000fbde7e18> (a org.apache.geode.internal.cache.entries.VersionedThinDiskRegionEntryHeapStringKey2)
        at org.apache.geode.internal.cache.InitialImageOperation.processChunk(InitialImageOperation.java:933)
        - locked <0x00000000fa633db0> (a org.apache.geode.internal.cache.entries.VersionedThinDiskRegionEntryHeapStringKey2)
        at org.apache.geode.internal.cache.InitialImageOperation$ImageProcessor.process(InitialImageOperation.java:1309)
        at org.apache.geode.distributed.internal.ReplyMessage.process(ReplyMessage.java:213)
        at org.apache.geode.internal.cache.InitialImageOperation$ImageReplyMessage.process(InitialImageOperation.java:2789)
        at org.apache.geode.distributed.internal.ReplyMessage.dmProcess(ReplyMessage.java:193)
        at org.apache.geode.distributed.internal.ReplyMessage.process(ReplyMessage.java:186)
        at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:378)
        at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:444)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1121)
        at org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:109)
        at org.apache.geode.distributed.internal.ClusterDistributionManager$5$1.run(ClusterDistributionManager.java:832)
        at java.lang.Thread.run(Thread.java:748)

Found 1 deadlock.
{noformat}
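
The cycle in the dump above can be read off the "locked" and "waiting to 
lock" lines: "Pooled Message Processor 14" holds the java.lang.String monitor 
and waits for the RegionEntry, while "Pooled High Priority Message Processor 
10" holds that RegionEntry and waits for the same String monitor. The sketch 
below is only an illustration of that reasoning, not Geode code: the class 
LockOrderGraph and the lock names (tombstoneServiceLock, stringMonitor, 
regionEntry) are hypothetical labels for the monitors in the dump. It records 
each thread's acquisition order as edges in a graph and reports a cycle when 
two orders conflict. The reordered() case shows one possible consistent 
order; it is an assumption for illustration and not necessarily the fix 
applied for this issue.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class LockOrderGraph {
    // Edge a -> b means: some thread acquired b while already holding a.
    private final Map<String, Set<String>> edges = new HashMap<>();

    /** Records one thread's acquisition order, outermost lock first. */
    public void record(String... order) {
        for (int i = 0; i < order.length; i++) {
            edges.computeIfAbsent(order[i], k -> new HashSet<>());
            for (int j = i + 1; j < order.length; j++) {
                // Skip reentrant acquisitions of the same monitor.
                if (!order[i].equals(order[j])) {
                    edges.get(order[i]).add(order[j]);
                }
            }
        }
    }

    /** Returns true if the recorded orders form a cycle (a possible deadlock). */
    public boolean hasCycle() {
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : edges.keySet()) {
            if (visit(node, visited, onPath)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search; reaching a node that is still on the current path is a cycle.
    private boolean visit(String node, Set<String> visited, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;
        }
        if (!visited.add(node)) {
            return false;
        }
        onPath.add(node);
        for (String next : edges.getOrDefault(node, Collections.<String>emptySet())) {
            if (visit(next, visited, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        return false;
    }

    /** Acquisition orders as they appear in the thread dump (labels are hypothetical). */
    public static LockOrderGraph fromDump() {
        LockOrderGraph g = new LockOrderGraph();
        // "Pooled Message Processor 14": gcTombstones -> removeTombstone
        g.record("tombstoneServiceLock", "stringMonitor", "regionEntry");
        // "Pooled High Priority Message Processor 10": initialImagePut -> removeTombstone
        g.record("regionEntry", "stringMonitor");
        return g;
    }

    /** Same paths with one consistent order (an assumption, not the actual fix). */
    public static LockOrderGraph reordered() {
        LockOrderGraph g = new LockOrderGraph();
        g.record("tombstoneServiceLock", "stringMonitor", "regionEntry");
        g.record("stringMonitor", "regionEntry");
        return g;
    }

    public static void main(String[] args) {
        System.out.println("dump orders have cycle: " + fromDump().hasCycle());
        System.out.println("reordered have cycle:   " + reordered().hasCycle());
    }
}
```

"Pooled Message Processor 90" does not appear in the graph because it holds 
no lock in the cycle; it is only blocked behind the TombstoneService monitor 
held by "Pooled Message Processor 14".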
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
