Have there been any changes within the last year involving the following?
Is anyone familiar with getOrCreateDefaultDiskStore and when it's invoked?

When Region creation for a persistent Region does not specify a disk store,
the code calls getOrCreateDefaultDiskStore, which then proceeds to create
the default disk store.

In December, this began to cause a deadlock (
https://issues.apache.org/jira/browse/GEODE-6255) in a persistent region
test that has two threads:

* Thread-1 is invoking Cache.close()
* Thread-2 is invoking Region creation for a persistent Region that will
use the default Disk Store which has not yet been created

Thread-1 (close) is acquiring locks in this order: synchronization on
GemFireCacheImpl.rootRegions, ManagementListener.writeLock

Thread-2 (create region) is acquiring locks in this order:
a) (for Region) synchronization on GemFireCacheImpl.rootRegions,
ManagementListener.readLock
b) (for Disk Store) ManagementListener.readLock, synchronization on
GemFireCacheImpl.rootRegions

Step (b) is what causes the deadlock, and it only occurs if the default
disk store needs to be created for the newly created region.
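
To make the inversion concrete, here is a minimal, self-contained Java sketch. It is not Geode code. rootRegions is modeled as a ReentrantLock (in GemFireCacheImpl it is a synchronized HashMap) so that tryLock with a timeout can show the conflict without actually hanging, and the class, method, and latch names are made up for illustration. The lock order is taken from the stack traces further down, assuming the close path holds the ManagementListener write lock while it waits for rootRegions, and the create path holds rootRegions while it waits for the read lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of the two code paths; none of these names exist in Geode.
class LockOrderSketch {

  // Takes `first`, waits until the other thread also holds its first lock,
  // then tries `second` with a timeout instead of blocking forever.
  static boolean holdFirstThenTrySecond(Lock first, Lock second,
      CountDownLatch bothHoldFirst, CountDownLatch bothTried) {
    first.lock();
    try {
      bothHoldFirst.countDown();
      bothHoldFirst.await();
      boolean gotSecond = second.tryLock(200, TimeUnit.MILLISECONDS);
      if (gotSecond) {
        second.unlock();
      }
      // Keep holding `first` until the other thread has finished trying too.
      bothTried.countDown();
      bothTried.await();
      return gotSecond;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      first.unlock();
    }
  }

  // Returns {close path got rootRegions, create path got the read lock}.
  static boolean[] run() throws InterruptedException {
    Lock rootRegions = new ReentrantLock(); // stand-in for the synchronized HashMap
    ReentrantReadWriteLock listenerLock = new ReentrantReadWriteLock();
    CountDownLatch bothHoldFirst = new CountDownLatch(2);
    CountDownLatch bothTried = new CountDownLatch(2);
    boolean[] gotSecond = new boolean[2];

    // Close path: write lock first, then rootRegions.
    Thread close = new Thread(() -> gotSecond[0] = holdFirstThenTrySecond(
        listenerLock.writeLock(), rootRegions, bothHoldFirst, bothTried));
    // Create path: rootRegions first, then the read lock (disk store event).
    Thread create = new Thread(() -> gotSecond[1] = holdFirstThenTrySecond(
        rootRegions, listenerLock.readLock(), bothHoldFirst, bothTried));

    close.start();
    create.start();
    close.join();
    create.join();
    return gotSecond;
  }

  public static void main(String[] args) throws InterruptedException {
    boolean[] gotSecond = run();
    System.out.println("close got rootRegions: " + gotSecond[0]);
    System.out.println("create got readLock: " + gotSecond[1]);
  }
}
```

Both attempts time out in this model. With plain blocking acquisition, as in the real code, neither thread could ever proceed.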

ManagementListener is creating JMX mbeans for whatever component was
just created.

I filed a ticket for the deadlock:
https://issues.apache.org/jira/browse/GEODE-6255 (not sure I used
"thread-1" and "thread-2" consistently between this email and the ticket --
I may have flipped them around).

I think creating the default disk store before creating the region might be
the only easy way to fix the bug. My pair already tried changing
ManagementListener to use a dedicated thread (or thread pool). We also
tried removing the ReadWriteLock to see what it's actually protecting, and
the resulting failures are more complicated to deal with than simply
creating the default disk store before creating the region.
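
As a rough illustration of that ordering change, here is a small single-threaded Java sketch. Everything in it is a hypothetical stand-in rather than the real Geode classes: the class name, the createDiskStoreFirst flag, and the firedUnderRootRegions list are invented, and only the disk store event is modeled (not the region-created event). It records whether the listener's read lock is taken while the rootRegions monitor is held, which is the nesting the create-region trace shows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the cache; not the real GemFireCacheImpl.
class DefaultDiskStoreOrderingSketch {
  private final Object rootRegions = new Object();
  private final ReentrantReadWriteLock listenerLock = new ReentrantReadWriteLock();
  private boolean defaultDiskStoreExists;

  // One entry per listener event: true if it fired while rootRegions was held.
  final List<Boolean> firedUnderRootRegions = new ArrayList<>();

  // Models the listener taking its read lock to handle a resource event.
  private void notifyListener() {
    listenerLock.readLock().lock();
    try {
      firedUnderRootRegions.add(Thread.holdsLock(rootRegions));
    } finally {
      listenerLock.readLock().unlock();
    }
  }

  // Creates the default disk store once and fires a single event for it.
  synchronized void getOrCreateDefaultDiskStore() {
    if (!defaultDiskStoreExists) {
      defaultDiskStoreExists = true;
      notifyListener();
    }
  }

  void createPersistentRegion(boolean createDiskStoreFirst) {
    if (createDiskStoreFirst) {
      // Proposed ordering: create the disk store before touching rootRegions.
      getOrCreateDefaultDiskStore();
    }
    synchronized (rootRegions) {
      // Models the region constructor looking up its disk store.
      getOrCreateDefaultDiskStore();
    }
  }

  public static void main(String[] args) {
    DefaultDiskStoreOrderingSketch lazy = new DefaultDiskStoreOrderingSketch();
    lazy.createPersistentRegion(false);
    System.out.println("lazy: " + lazy.firedUnderRootRegions);   // prints "lazy: [true]"

    DefaultDiskStoreOrderingSketch eager = new DefaultDiskStoreOrderingSketch();
    eager.createPersistentRegion(true);
    System.out.println("eager: " + eager.firedUnderRootRegions); // prints "eager: [false]"
  }
}
```

With the disk store created up front, the second call inside the synchronized block finds it already present and fires no event, so in this model the read lock is never requested while rootRegions is held.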

"thread-1":
at org.apache.geode.internal.cache.GemFireCacheImpl.removeRoot(GemFireCacheImpl.java:3577)
- waiting to lock <0x0000000773583c28> (a java.util.HashMap)
at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6333)
at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1755)
at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6255)
at org.apache.geode.internal.cache.LocalRegion.localDestroyRegion(LocalRegion.java:2242)
at org.apache.geode.internal.cache.AbstractRegion.localDestroyRegion(AbstractRegion.java:430)
at org.apache.geode.management.internal.ManagementResourceRepo.destroyLocalMonitoringRegion(ManagementResourceRepo.java:73)
at org.apache.geode.management.internal.LocalManager.cleanUpResources(LocalManager.java:260)
at org.apache.geode.management.internal.LocalManager.stopManager(LocalManager.java:388)
at org.apache.geode.management.internal.SystemManagementService.close(SystemManagementService.java:239)
- locked <0x000000077361b900> (a java.util.HashMap)
at org.apache.geode.management.internal.beans.ManagementAdapter.handleCacheRemoval(ManagementAdapter.java:737)
at org.apache.geode.management.internal.beans.ManagementListener.handleEvent(ManagementListener.java:119)
at org.apache.geode.distributed.internal.InternalDistributedSystem.notifyResourceEventListeners(InternalDistributedSystem.java:2201)
at org.apache.geode.distributed.internal.InternalDistributedSystem.handleResourceEvent(InternalDistributedSystem.java:606)
at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2127)
- locked <0x00000006c010d508> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:1966)
at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:1956)
at org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.closeCache(CreateDestroyRegionRegressionTest.java:119)
at org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.lambda$hang$1(CreateDestroyRegionRegressionTest.java:93)
at org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest$$Lambda$3/1456208737.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

"thread-2": at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007735ff8e0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
at org.apache.geode.management.internal.beans.ManagementListener.handleEvent(ManagementListener.java:110)
at org.apache.geode.distributed.internal.InternalDistributedSystem.notifyResourceEventListeners(InternalDistributedSystem.java:2201)
at org.apache.geode.distributed.internal.InternalDistributedSystem.handleResourceEvent(InternalDistributedSystem.java:606)
at org.apache.geode.internal.cache.DiskStoreFactoryImpl.create(DiskStoreFactoryImpl.java:144)
- locked <0x0000000773583ac8> (a org.apache.geode.internal.cache.GemFireCacheImpl)
at org.apache.geode.internal.cache.GemFireCacheImpl.getOrCreateDefaultDiskStore(GemFireCacheImpl.java:2566)
- locked <0x0000000773583ac8> (a org.apache.geode.internal.cache.GemFireCacheImpl)
at org.apache.geode.internal.cache.LocalRegion.findDiskStore(LocalRegion.java:7600)
at org.apache.geode.internal.cache.LocalRegion.<init>(LocalRegion.java:647)
at org.apache.geode.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3023)
- locked <0x0000000773583c28> (a java.util.HashMap)
at org.apache.geode.internal.cache.GemFireCacheImpl.basicCreateRegion(GemFireCacheImpl.java:2956)
at org.apache.geode.internal.cache.GemFireCacheImpl.createRegion(GemFireCacheImpl.java:2944)
at org.apache.geode.cache.RegionFactory.create(RegionFactory.java:755)
at org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.createRegionWithDefaultDiskStore(CreateDestroyRegionRegressionTest.java:105)
at org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.lambda$hang$0(CreateDestroyRegionRegressionTest.java:92)
at org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest$$Lambda$2/901506536.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Found 1 deadlock.
