[ https://issues.apache.org/jira/browse/GEODE-2240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15783991#comment-15783991 ]
Anthony Baker commented on GEODE-2240:
--------------------------------------

I think this fix is causing a deadlock, see the thread dump below:

{code}
vm_6_persist5_ceabe8ff-1c19-4ac0-5ed1-c77f9d8b22ee_30240:vm_6_thr_17_persist5_ceabe8ff-1c19-4ac0-5ed1-c77f9d8b22ee_30240 ID=0x1f(31) state=TIMED_WAITING
    waiting to lock <java.util.concurrent.locks.ReentrantLock$NonfairSync@3be415f9>
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos(AbstractQueuedSynchronizer.java:934)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1247)
    at java.util.concurrent.locks.ReentrantLock.tryLock(ReentrantLock.java:442)
    at org.apache.geode.internal.util.concurrent.StoppableReentrantLock.lockInterruptibly(StoppableReentrantLock.java:88)
    at org.apache.geode.internal.util.concurrent.StoppableReentrantLock.lock(StoppableReentrantLock.java:71)
    at org.apache.geode.internal.cache.TombstoneService$TombstoneSweeper.lockQueueHead(TombstoneService.java:845)
    at org.apache.geode.internal.cache.TombstoneService$TombstoneSweeper.removeUnexpiredIf(TombstoneService.java:801)
    at org.apache.geode.internal.cache.TombstoneService$TombstoneSweeper.access$000(TombstoneService.java:718)
    at org.apache.geode.internal.cache.TombstoneService.gcTombstones(TombstoneService.java:221)
      locked <java.lang.Object@e63a28>
    at org.apache.geode.internal.cache.InitialImageOperation.getFromOne(InitialImageOperation.java:512)
    at org.apache.geode.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1307)
    at org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1101)
    at org.apache.geode.internal.cache.LocalRegion.createSubregion(LocalRegion.java:959)
    at org.apache.geode.internal.cache.LocalRegion.createSubregion(LocalRegion.java:824)
    at diskRecovery.RecoveryTest.createSubregions(RecoveryTest.java:3036)
    at diskRecovery.RecoveryTest.createRegionHier(RecoveryTest.java:2990)
    at diskRecovery.RecoveryTest.HydraTask_initialize(RecoveryTest.java:245)
      locked <java.lang.Class@45829aeb>
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at hydra.MethExecutor.execute(MethExecutor.java:182)
    at hydra.MethExecutor.execute(MethExecutor.java:150)
    at hydra.TestTask.execute(TestTask.java:192)
    at hydra.RemoteTestModule$1.run(RemoteTestModule.java:212)

vm_6_persist5_ceabe8ff-1c19-4ac0-5ed1-c77f9d8b22ee_30240:Replicate/Partition Region Garbage Collector ID=0x54(84) state=BLOCKED
    waiting to lock <java.lang.Object@e63a28>
    at org.apache.geode.internal.cache.TombstoneService$ReplicateTombstoneSweeper.expireTombstone(TombstoneService.java:672)
    at org.apache.geode.internal.cache.TombstoneService$TombstoneSweeper.checkOldestUnexpired(TombstoneService.java:982)
    at org.apache.geode.internal.cache.TombstoneService$TombstoneSweeper.run(TombstoneService.java:881)
    at java.lang.Thread.run(Thread.java:745)
    Locked synchronizers:
      java.util.concurrent.locks.ReentrantLock$NonfairSync@3be415f9
{code}

> unexpected NullPointerException from Tombstone service
> ------------------------------------------------------
>
>                 Key: GEODE-2240
>                 URL: https://issues.apache.org/jira/browse/GEODE-2240
>             Project: Geode
>          Issue Type: Bug
>          Components: regions
>            Reporter: Darrel Schneider
>            Assignee: Darrel Schneider
>             Fix For: 1.1.0
>
>
> A test failed and the logs were found to be full of NPEs from the tombstone service:
>
> [severe 2016/12/20 02:04:35.605 UTC dataStoregemfire7_rs-StorageBTTest-2016-12-19-23-35-42-client-14_19508 <Replicate/Partition Region Garbage Collector> tid=0x44] GemFire garbage collection service encountered an unexpected exception
> java.lang.NullPointerException
>     at org.apache.geode.internal.cache.TombstoneService$TombstoneSweeper.lambda$purgeObsoleteTombstones$1(TombstoneService.java:938)
>     at org.apache.geode.internal.cache.TombstoneService$ReplicateTombstoneSweeper.removeExpiredIf(TombstoneService.java:479)
>     at org.apache.geode.internal.cache.TombstoneService$TombstoneSweeper.removeIf(TombstoneService.java:823)
>     at org.apache.geode.internal.cache.TombstoneService$TombstoneSweeper.purgeObsoleteTombstones(TombstoneService.java:937)
>     at org.apache.geode.internal.cache.TombstoneService$TombstoneSweeper.run(TombstoneService.java:880)
>     at java.lang.Thread.run(Thread.java:745)
>
> [severe 2016/12/20 02:05:45.987 UTC dataStoregemfire7_rs-StorageBTTest-2016-12-19-23-35-42-client-14_19508 <Replicate/Partition Region Garbage Collector> tid=0x44] GemFire garbage collection service encountered an unexpected exception
> java.lang.NullPointerException
>     at org.apache.geode.internal.cache.TombstoneService$ReplicateTombstoneSweeper.expireBatch(TombstoneService.java:524)
>     at org.apache.geode.internal.cache.TombstoneService$ReplicateTombstoneSweeper.checkExpiredTombstoneGC(TombstoneService.java:594)
>     at org.apache.geode.internal.cache.TombstoneService$TombstoneSweeper.run(TombstoneService.java:878)
>     at java.lang.Thread.run(Thread.java:745)
>
> Both of these stacks indicate that the "expiredTombstones" ArrayList somehow has nulls in it. It is an ArrayList of Tombstone instances and the only code that adds to it first tests that the item it is adding is not null. The only other modify operation done on it is to remove an item.
> Perhaps unsafe concurrent access is happening causing this code to see nulls.
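
For readers following the thread dump in the comment: the two stacks look like a lock-order inversion. The thread in gcTombstones holds the monitor Object@e63a28 and waits for the sweeper's ReentrantLock, while the Garbage Collector thread owns that ReentrantLock (see "Locked synchronizers") and is blocked on Object@e63a28. Below is a minimal sketch of that shape, not Geode code. The class name, the field names queueHeadLock and gcLock, and the two latches are hypothetical, and both locks are modeled as ReentrantLock with a timed tryLock so the demo terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderSketch {
    // Hypothetical stand-ins for the two locks in the dump: the sweeper's
    // queue-head ReentrantLock and the monitor taken in gcTombstones.
    private final ReentrantLock queueHeadLock = new ReentrantLock();
    private final ReentrantLock gcLock = new ReentrantLock();

    // Both threads must hold their first lock before either tries its second.
    private final CountDownLatch bothHoldFirst = new CountDownLatch(2);
    // Neither thread releases its first lock until both attempts are over.
    private final CountDownLatch bothAttempted = new CountDownLatch(2);

    // Takes `first`, then tries `second` with a timeout.
    // Returns true only if `second` was acquired.
    private boolean holdThenTry(ReentrantLock first, ReentrantLock second) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            boolean acquired = second.tryLock(100, TimeUnit.MILLISECONDS);
            if (acquired) {
                second.unlock();
            }
            bothAttempted.countDown();
            bothAttempted.await();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            first.unlock();
        }
    }

    // Runs two threads that take the locks in opposite orders.
    // result[0]: did the image thread get its second lock?
    // result[1]: did the sweeper thread get its second lock?
    boolean[] runInverted() {
        boolean[] result = new boolean[2];
        Thread imageThread =
            new Thread(() -> result[0] = holdThenTry(gcLock, queueHeadLock));
        Thread sweeperThread =
            new Thread(() -> result[1] = holdThenTry(queueHeadLock, gcLock));
        imageThread.start();
        sweeperThread.start();
        try {
            imageThread.join();
            sweeperThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[] r = new LockOrderSketch().runInverted();
        System.out.println("image thread got second lock: " + r[0]);
        System.out.println("sweeper thread got second lock: " + r[1]);
    }
}
```

With untimed lock() calls in place of tryLock, the same interleaving would block both threads indefinitely. The usual remedy for this pattern is to have every code path acquire the two locks in one agreed order; whether that is the right change for TombstoneService is not something this sketch establishes.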