[ 
https://issues.apache.org/jira/browse/GEODE-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16889230#comment-16889230
 ] 

Darrel Schneider commented on GEODE-6896:
-----------------------------------------

I think the only case in which we will silently start a server without cluster 
config (when it has use-shared-configuration=true) is when all of our locators 
have disabled cluster config. When gfsh starts a locator, it defaults to 
enable-shared-configuration=true.

So to run into this, users would need to either start the locator without gfsh 
or explicitly set enable-shared-configuration=false when starting it with gfsh.

If we have at least one locator that has enable-shared-configuration=true, then 
GemFireCacheImpl.requestSharedConfiguration() will call 
ccLoader.requestConfigurationFromLocators. It will try getting cluster config 
from each locator. If that fails, it sleeps for 10 seconds and retries, up to 6 
times. If after all those retries (at least 1 minute has elapsed) it still does 
not have a config, it will throw ClusterConfigurationNotAvailableException, 
which requestSharedConfiguration turns into a GemFireConfigException that it 
throws. So if we have a locator that cannot provide us cluster config for some 
reason, then our cache creation will fail with GemFireConfigException.
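
To make the retry behavior concrete, here is a rough, self-contained sketch of 
one reading of that loop. It is not the actual Geode code: the class name, the 
fetch and sleeper parameters, and ConfigNotAvailableException are hypothetical 
stand-ins, and I am assuming "retry 6 times" means one initial attempt followed 
by up to 6 retries, each preceded by a 10 second sleep.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.LongConsumer;

// Hypothetical sketch; none of these names are Geode's real API.
final class ClusterConfigRetrySketch {
  static final int MAX_RETRIES = 6;
  static final long RETRY_DELAY_MILLIS = 10_000L;

  /** Stand-in for ClusterConfigurationNotAvailableException. */
  static class ConfigNotAvailableException extends Exception {
    ConfigNotAvailableException(String message) {
      super(message);
    }
  }

  /**
   * Asks each locator in turn for the cluster config. If no locator answers,
   * sleeps and tries again, giving up after MAX_RETRIES retries.
   * The sleeper is injected so the loop can be exercised without real waiting.
   */
  static String requestConfig(List<String> locators,
      Function<String, Optional<String>> fetch,
      LongConsumer sleeper) throws ConfigNotAvailableException {
    for (int attempt = 0; attempt <= MAX_RETRIES; attempt++) {
      if (attempt > 0) {
        sleeper.accept(RETRY_DELAY_MILLIS); // wait before each retry
      }
      for (String locator : locators) {
        Optional<String> config = fetch.apply(locator);
        if (config.isPresent()) {
          return config.get();
        }
      }
    }
    throw new ConfigNotAvailableException(
        "No locator returned cluster config after " + MAX_RETRIES + " retries");
  }

  public static void main(String[] args) throws ConfigNotAvailableException {
    // Only the second (made-up) locator can serve a config in this demo.
    String config = requestConfig(
        List.of("locator1", "locator2"),
        locator -> locator.equals("locator2")
            ? Optional.of("<cache/>")
            : Optional.<String>empty(),
        millis -> { /* no real sleeping in the demo */ });
    System.out.println(config);
  }
}
```

Under this reading, a member whose locators never answer sleeps 6 x 10 seconds 
in total before the exception is thrown, which matches the "at least 1 minute" 
above.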

So I think the case we want to address is when we go to load the cluster config 
from a locator and no locators are running. If locators are running and they 
have enable-shared-configuration=false, then we should just log the message and 
start up without cluster config. But if no locators are available, we should 
report a problem if we have use-shared-configuration=true.

So if getAllHostedLocatorsWithSharedConfiguration returns an empty set, we could 
also check whether getAllHostedLocators is empty. If it is, then a locator has 
crashed and we should report an error if config.getUseSharedConfiguration() 
returns true.
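
As a rough sketch of that decision (not a patch against GemFireCacheImpl), the 
check could look something like the following. The class, enum, and method 
names are hypothetical; the two sets stand in for the results of 
getAllHostedLocators and getAllHostedLocatorsWithSharedConfiguration, and the 
boolean stands in for config.getUseSharedConfiguration().

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical sketch of the proposed check; names are illustrative only.
final class LocatorCheckSketch {
  enum Decision {
    REQUEST_FROM_LOCATORS,        // at least one locator serves cluster config
    START_WITHOUT_CLUSTER_CONFIG, // log the message and continue, as today
    REPORT_ERROR                  // no locators at all, but config was expected
  }

  static Decision decide(Set<String> allLocators,
      Set<String> locatorsWithSharedConfig,
      boolean useSharedConfiguration) {
    if (!locatorsWithSharedConfig.isEmpty()) {
      return Decision.REQUEST_FROM_LOCATORS;
    }
    // No locator offers cluster config. If there are no locators at all,
    // assume one crashed, and fail when the member expected cluster config.
    if (allLocators.isEmpty() && useSharedConfiguration) {
      return Decision.REPORT_ERROR;
    }
    // Locators exist but have cluster config disabled, or the member did
    // not ask for cluster config in the first place.
    return Decision.START_WITHOUT_CLUSTER_CONFIG;
  }

  public static void main(String[] args) {
    Set<String> none = Collections.emptySet();
    Set<String> one = Collections.singleton("locator1");
    System.out.println(decide(one, one, true));
    System.out.println(decide(one, none, true));
    System.out.println(decide(none, none, true));
  }
}
```

Keeping the decision in one small function like this would also make the three 
cases easy to cover in a unit test.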

> Member can startup without cluster configuration if locator crashes on startup
> ------------------------------------------------------------------------------
>
>                 Key: GEODE-6896
>                 URL: https://issues.apache.org/jira/browse/GEODE-6896
>             Project: Geode
>          Issue Type: Bug
>          Components: management
>            Reporter: Dan Smith
>            Priority: Major
>         Attachments: GEODE-6896-test.diff
>
>
> Even if a user is actually using cluster configuration, it's possible for a 
> member to start up without using the cluster configuration.
> If a locator crashes after a member joins the distributed system, but before 
> it gets the cluster configuration, the member will just log a message and 
> start up without cluster configuration in this check in GemFireCacheImpl
> {noformat}
> if (locatorsWithClusterConfig.isEmpty()) {
>       logger.info("No locator(s) found with cluster configuration service");
>       return null;
>     }
> {noformat}
> This is especially problematic if persistent PDX is used, because the member 
> will start up with a non persistent PDX region which will prevent any other 
> members with persistence from starting up.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
