Dan Smith created GEODE-1248:
--------------------------------
Summary: gfsh shutdown command does not shutdown members waiting
for missing disk stores
Key: GEODE-1248
URL: https://issues.apache.org/jira/browse/GEODE-1248
Project: Geode
Issue Type: Bug
Components: persistence
Reporter: Dan Smith
The gfsh shutdown command fails to shut down members that are waiting for
another member to recover the latest data. Instead, the shutdown operation gets
stuck waiting for the lock on the cache that it needs to shut down the member.
Steps to reproduce.
1. Start a locator and two members
2. Create a REPLICATE_PERSISTENT region in gfsh
> create region --name="replicate" --type=REPLICATE_PERSISTENT
3. Do a put (probably not necessary)
> put --key="a" --value="a" --region=/replicate
4. Shut down within gfsh
> shutdown --include-locators=false
5. Start one member. It will get stuck waiting for other members to start.
6. Shut down within gfsh again.
> shutdown --include-locators=false
7. List members. You will see that the member is still up.
> list members
The end result after step 7 is that the member is still up. In the stack dump,
we see that the shutdown is blocked on the cache lock.
{noformat}
"Function Execution Processor1" #62 daemon prio=10 os_prio=0 tid=0x00007fe988013800 nid=0xf83a waiting for monitor entry [0x00007fe96e062000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.gemstone.gemfire.cache.CacheFactory.getAnyInstance(CacheFactory.java:292)
    - waiting to lock <0x000000071f13e170> (a java.lang.Class for com.gemstone.gemfire.cache.CacheFactory)
    at com.gemstone.gemfire.management.internal.cli.functions.ShutDownFunction.execute(ShutDownFunction.java:46)
    at com.gemstone.gemfire.internal.cache.MemberFunctionStreamingMessage.process(MemberFunctionStreamingMessage.java:194)
    at com.gemstone.gemfire.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:379)
    at com.gemstone.gemfire.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:450)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at com.gemstone.gemfire.distributed.internal.DistributionManager.runUntilShutdown(DistributionManager.java:655)
    at com.gemstone.gemfire.distributed.internal.DistributionManager$9$1.run(DistributionManager.java:1115)
    at java.lang.Thread.run(Thread.java:745)

"main" #1 prio=5 os_prio=0 tid=0x00007fea0400a000 nid=0xf7dd in Object.wait() [0x00007fea0afa4000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at com.gemstone.gemfire.internal.cache.persistence.PersistenceAdvisorImpl$MembershipChangeListener.waitForChange(PersistenceAdvisorImpl.java:1144)
    - locked <0x000000078b067058> (a com.gemstone.gemfire.internal.cache.persistence.PersistenceAdvisorImpl$MembershipChangeListener)
    at com.gemstone.gemfire.internal.cache.persistence.PersistenceAdvisorImpl.getInitialImageAdvice(PersistenceAdvisorImpl.java:875)
    at com.gemstone.gemfire.internal.cache.persistence.CreatePersistentRegionProcessor.getInitialImageAdvice(CreatePersistentRegionProcessor.java:55)
    at com.gemstone.gemfire.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1389)
    at com.gemstone.gemfire.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1217)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3153)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.basicCreateRegion(GemFireCacheImpl.java:3047)
    at com.gemstone.gemfire.internal.cache.xmlcache.RegionCreation.createRoot(RegionCreation.java:262)
    at com.gemstone.gemfire.internal.cache.xmlcache.CacheCreation.initializeRegions(CacheCreation.java:555)
    at com.gemstone.gemfire.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:528)
    at com.gemstone.gemfire.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:353)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4319)
    at com.gemstone.gemfire.internal.cache.ClusterConfigurationLoader.applyClusterConfiguration(ClusterConfigurationLoader.java:141)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.requestAndApplySharedConfiguration(GemFireCacheImpl.java:1020)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1161)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.basicCreate(GemFireCacheImpl.java:785)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.create(GemFireCacheImpl.java:773)
    at com.gemstone.gemfire.cache.CacheFactory.create(CacheFactory.java:178)
    - locked <0x000000071f13e170> (a java.lang.Class for com.gemstone.gemfire.cache.CacheFactory)
    at com.gemstone.gemfire.cache.CacheFactory.create(CacheFactory.java:228)
    - locked <0x000000071f13e170> (a java.lang.Class for com.gemstone.gemfire.cache.CacheFactory)
    at com.gemstone.gemfire.distributed.internal.DefaultServerLauncherCacheProvider.createCache(DefaultServerLauncherCacheProvider.java:55)
    at com.gemstone.gemfire.distributed.ServerLauncher.createCache(ServerLauncher.java:806)
    at com.gemstone.gemfire.distributed.ServerLauncher.start(ServerLauncher.java:726)
    at com.gemstone.gemfire.distributed.ServerLauncher.run(ServerLauncher.java:656)
    at com.gemstone.gemfire.distributed.ServerLauncher.main(ServerLauncher.java:207)
{noformat}
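The two stacks above boil down to one thread holding the CacheFactory class monitor while it waits, and a second thread blocking on that same monitor. A minimal sketch of that pattern follows. It is not Geode code: FakeCacheFactory, its two methods, and the latches are hypothetical stand-ins, and the only assumption taken from the dump is that both create() and getAnyInstance() lock the class.

```java
import java.util.concurrent.CountDownLatch;

public class LockContentionDemo {

    // Hypothetical stand-in for CacheFactory. Both static synchronized
    // methods lock the same monitor: the FakeCacheFactory class object.
    static class FakeCacheFactory {
        static synchronized void create(CountDownLatch entered, CountDownLatch release)
                throws InterruptedException {
            entered.countDown();
            // Stands in for waiting on other members to recover the latest
            // data; the class monitor stays held for the whole wait.
            release.await();
        }

        static synchronized String getAnyInstance() {
            return "cache";
        }
    }

    // Starts a "startup" thread that parks inside create(), then a "shutdown"
    // thread that calls getAnyInstance(), and returns the state observed for
    // the shutdown thread while create() still holds the lock.
    static Thread.State observeShutdownThreadState() throws InterruptedException {
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread startup = new Thread(() -> {
            try {
                FakeCacheFactory.create(entered, release);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "startup");
        startup.start();
        entered.await(); // create() now holds the class monitor

        Thread shutdown = new Thread(FakeCacheFactory::getAnyInstance, "shutdown-function");
        shutdown.start();

        // Poll (for up to 5 seconds) until the shutdown thread reaches the monitor.
        Thread.State state = shutdown.getState();
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (state != Thread.State.BLOCKED && System.nanoTime() < deadline) {
            Thread.sleep(10);
            state = shutdown.getState();
        }

        // Let both threads finish so the demo exits cleanly.
        release.countDown();
        startup.join();
        shutdown.join();
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("shutdown thread state: " + observeShutdownThreadState());
    }
}
```

In the sketch the wait ends only because the demo releases the latch. In the reported scenario nothing releases the waiting thread until another member starts, so the shutdown function never gets past the lock.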
The shutdown command needs to somehow trigger shutdown even when the cache is
in this waiting state during startup.