Have there been any changes within the last year involving the following? Is anyone familiar with getOrCreateDefaultDiskStore and when it's invoked?
When Region creation for a persistent Region does not specify a disk store, the code call getOrCreateDefaultDiskStore. It then proceeds to create the default disk store. In December, this began to cause a dead lock ( https://issues.apache.org/jira/browse/GEODE-6255) in a persistent region test that has two threads: * Thread-1 is invoking Cache.close() * Thread-2 is invoking Region creation for a persistent Region that will use the default Disk Store which has not yet been created Thread-1 (close) is acquiring locks in this order: synchronization on GemFireCacheImpl.rootRegions, MangementListener.writeLock Thread-2 (create region) is acquiring locks in this order: a) (for Region) synchronization on GemFireCacheImpl.rootRegions, MangementListener.readLock b) (for Disk Store) MangementListener.readLock, synchronization on GemFireCacheImpl.rootRegions Step (b) is what causes the deadlock and this only occurs if the default disk store needs to be created for the newly created region. ManagementListener is creating JMX mbeans for the whatever component was just created. I filed a ticket for the deadlock: https://issues.apache.org/jira/browse/GEODE-6255 (not sure I used "thread-1" and "thread-2" consistently between this email and the ticket -- I may have flipped them around). I think creating the default disk store before creating the region might be the only easy way to fix the bug. My pair already tried changing ManagementListener to use a dedicated thread (or thread pool). We also tried removing the ReadWriteLock to see what it's actually protecting and the failures are more complicated than creating the default disk store before creating the region. "thread-1": at org.apache.geode.internal.cache.GemFireCacheImpl.removeRoot(GemFireCacheImpl.java:3577) - waiting to lock <0x0000000773583c28> (a java.util.HashMap) at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6333) at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1755) at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6255) at org.apache.geode.internal.cache.LocalRegion.localDestroyRegion(LocalRegion.java:2242) at org.apache.geode.internal.cache.AbstractRegion.localDestroyRegion(AbstractRegion.java:430) at org.apache.geode.management.internal.ManagementResourceRepo.destroyLocalMonitoringRegion(ManagementResourceRepo.java:73) at org.apache.geode.management.internal.LocalManager.cleanUpResources(LocalManager.java:260) at org.apache.geode.management.internal.LocalManager.stopManager(LocalManager.java:388) at org.apache.geode.management.internal.SystemManagementService.close(SystemManagementService.java:239) - locked <0x000000077361b900> (a java.util.HashMap) at org.apache.geode.management.internal.beans.ManagementAdapter.handleCacheRemoval(ManagementAdapter.java:737) at org.apache.geode.management.internal.beans.ManagementListener.handleEvent(ManagementListener.java:119) at org.apache.geode.distributed.internal.InternalDistributedSystem.notifyResourceEventListeners(InternalDistributedSystem.java:2201) at org.apache.geode.distributed.internal.InternalDistributedSystem.handleResourceEvent(InternalDistributedSystem.java:606) at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2127) - locked <0x00000006c010d508> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl) at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:1966) at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:1956) at org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.closeCache(CreateDestroyRegionRegressionTest.java:119) at org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.lambda$hang$1(CreateDestroyRegionRegressionTest.java:93) at org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest$$Lambda$3/1456208737.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) "thread-2": at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000007735ff8e0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283) at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727) at org.apache.geode.management.internal.beans.ManagementListener.handleEvent(ManagementListener.java:110) at org.apache.geode.distributed.internal.InternalDistributedSystem.notifyResourceEventListeners(InternalDistributedSystem.java:2201) at org.apache.geode.distributed.internal.InternalDistributedSystem.handleResourceEvent(InternalDistributedSystem.java:606) at org.apache.geode.internal.cache.DiskStoreFactoryImpl.create(DiskStoreFactoryImpl.java:144) - locked <0x0000000773583ac8> (a org.apache.geode.internal.cache.GemFireCacheImpl) at org.apache.geode.internal.cache.GemFireCacheImpl.getOrCreateDefaultDiskStore(GemFireCacheImpl.java:2566) - locked <0x0000000773583ac8> (a org.apache.geode.internal.cache.GemFireCacheImpl) at org.apache.geode.internal.cache.LocalRegion.findDiskStore(LocalRegion.java:7600) at org.apache.geode.internal.cache.LocalRegion.<init>(LocalRegion.java:647) at org.apache.geode.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3023) - locked <0x0000000773583c28> (a java.util.HashMap) at org.apache.geode.internal.cache.GemFireCacheImpl.basicCreateRegion(GemFireCacheImpl.java:2956) at org.apache.geode.internal.cache.GemFireCacheImpl.createRegion(GemFireCacheImpl.java:2944) at org.apache.geode.cache.RegionFactory.create(RegionFactory.java:755) at org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.createRegionWithDefaultDiskStore(CreateDestroyRegionRegressionTest.java:105) at org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest.lambda$hang$0(CreateDestroyRegionRegressionTest.java:92) at org.apache.geode.internal.cache.persistence.CreateDestroyRegionRegressionTest$$Lambda$2/901506536.run(Unknown Source) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Found 1 deadlock.