[
https://issues.apache.org/jira/browse/GEODE-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16889230#comment-16889230
]
Darrel Schneider commented on GEODE-6896:
-----------------------------------------
I think the only case in which we will silently start a server without cluster
config (when it has use-shared-configuration=true) is if all our locators have
disabled cluster config. When gfsh starts a locator, it defaults to
enable-shared-configuration=true.
So users would need to either start the locator without gfsh, or explicitly
set enable-shared-configuration=false when starting it with gfsh, to run into
this.
If we have at least one locator that has enable-shared-configuration=true then
GemFireCacheImpl.requestSharedConfiguration() will call
ccLoader.requestConfigurationFromLocators. It will try getting cluster config
from each locator. If that fails, it sleeps for 10 seconds and retries, up to 6
times. If after all those retries (at least 1 minute has elapsed) it still does
not have a config, it throws ClusterConfigurationNotAvailableException, which
requestSharedConfiguration turns into a GemFireConfigException that it throws.
So if we have a locator that cannot provide us cluster config for some reason
then our cache creation will fail with GemFireConfigException.
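To make the retry behavior above concrete, here is a minimal standalone sketch
of that loop. It is not the Geode code: the class, the exception type, and the
fetch function are hypothetical stand-ins, and it assumes 6 attempts with a
sleep after each failed pass over the locators (which gives the "at least 1
minute" when the sleep is 10 seconds).

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

class ClusterConfigRetrySketch {
  /** Hypothetical stand-in for ClusterConfigurationNotAvailableException. */
  static class ConfigNotAvailableException extends RuntimeException {
    ConfigNotAvailableException(String message) {
      super(message);
    }
  }

  /**
   * Asks each locator in turn for the cluster config. If no locator answers,
   * sleeps and tries the whole list again, up to maxAttempts passes.
   *
   * @param fetch hypothetical lookup; returns empty when a locator has no config
   */
  static String requestFromLocators(List<String> locators,
      Function<String, Optional<String>> fetch, int maxAttempts, long sleepMillis) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      for (String locator : locators) {
        Optional<String> config = fetch.apply(locator);
        if (config.isPresent()) {
          return config.get();
        }
      }
      try {
        // Assumption: sleep after every failed pass, including the last one.
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break; // give up early if the thread is interrupted
      }
    }
    throw new ConfigNotAvailableException(
        "cluster configuration not available after " + maxAttempts + " attempts");
  }
}
```

With maxAttempts=6 and sleepMillis=10000 this matches the timing described
above; the caller would be the place that converts the failure into a
GemFireConfigException.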
So I think the case we want to address is when we go to load the cluster config
from a locator and no locators are running. If locators are running and they
have enable-shared-configuration=false then we should just log the message and
start up without cluster config. But if no locators are available we should
report a problem if we have use-shared-configuration=true.
So if getAllHostedLocatorsWithSharedConfiguration returns an empty set we could
also check if getAllHostedLocators is empty. If it is, then a locator has
crashed and we should report an error if config.getUseSharedConfiguration()
returns true.
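A rough sketch of the check I am proposing, as a standalone decision function.
The class, enum, and parameter names are made up for illustration; the two sets
stand in for the results of getAllHostedLocatorsWithSharedConfiguration and
getAllHostedLocators, and the boolean for config.getUseSharedConfiguration().

```java
import java.util.Set;

class SharedConfigStartupCheck {
  /** Hypothetical outcomes of the startup check. */
  enum Decision {
    REQUEST_FROM_LOCATORS,        // at least one locator offers cluster config
    START_WITHOUT_CLUSTER_CONFIG, // locators exist but all disabled cluster config
    FAIL_NO_LOCATORS              // no locator at all, yet cluster config was requested
  }

  static Decision decide(Set<String> locatorsWithSharedConfig,
      Set<String> allLocators, boolean useSharedConfiguration) {
    if (!locatorsWithSharedConfig.isEmpty()) {
      return Decision.REQUEST_FROM_LOCATORS;
    }
    // No locator offers cluster config. If there is no locator at all, one
    // has probably crashed, so refuse to start when cluster config is wanted.
    if (allLocators.isEmpty() && useSharedConfiguration) {
      return Decision.FAIL_NO_LOCATORS;
    }
    // Locators are running but have cluster config disabled (or the member
    // did not ask for it): log the message and start without cluster config.
    return Decision.START_WITHOUT_CLUSTER_CONFIG;
  }
}
```

Only the FAIL_NO_LOCATORS branch is new behavior; the other two branches are
what the member already does today.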
> Member can startup without cluster configuration if locator crashes on startup
> ------------------------------------------------------------------------------
>
> Key: GEODE-6896
> URL: https://issues.apache.org/jira/browse/GEODE-6896
> Project: Geode
> Issue Type: Bug
> Components: management
> Reporter: Dan Smith
> Priority: Major
> Attachments: GEODE-6896-test.diff
>
>
> Even if a user is actually using cluster configuration, it's possible for a
> member to start up without using the cluster configuration.
> If a locator crashes after a member joins the distributed system, but before
> it gets the cluster configuration, the member will just log a message and
> start up without cluster configuration in this check in GemFireCacheImpl
> {noformat}
> if (locatorsWithClusterConfig.isEmpty()) {
>   logger.info("No locator(s) found with cluster configuration service");
>   return null;
> }
> {noformat}
> This is especially problematic if persistent PDX is used, because the member
> will start up with a non persistent PDX region which will prevent any other
> members with persistence from starting up.