Jens Deppe created GEODE-1134:
---------------------------------

             Summary: Stale Cluster Configuration Service
                 Key: GEODE-1134
                 URL: https://issues.apache.org/jira/browse/GEODE-1134
             Project: Geode
          Issue Type: Bug
          Components: configuration
            Reporter: Jens Deppe


There seems to be an issue with the cluster configuration service: manual 
modifications to the "cluster.xml" and/or "cluster.properties" files are 
only picked up by the servers when the ENTIRE cluster is restarted.

The official user guide states the following: "If you make any manual 
modifications to the cluster.xml or cluster.properties (or group_name.xml or 
group_name.properties) files, you must stop the locator and then restart the 
locator using the --load-cluster-configuration-from-dir parameter. Direct file 
modifications are not picked up by the cluster configuration service without a 
locator restart." So it should be possible to restart the members in a 
rolling fashion: as long as the locators are restarted first and correctly 
pick up the new cluster configuration files from disk, the servers should 
receive the new cluster configuration once they are restarted afterwards. 
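
As a rough sketch of the rolling sequence described above (not taken from the attached workspace), the script below prints the gfsh commands in the order I would expect to work: locators first, then servers one at a time. The member names, the locator address, and the DRY_RUN switch are illustrative assumptions; a real run would also need the original working directories, ports, and any other start options used by the cluster.

```shell
#!/bin/sh
# Dry-run sketch: prints the gfsh commands for a rolling restart, locators first.
# Member names, the locator address, and DRY_RUN are illustrative assumptions.
DRY_RUN="${DRY_RUN:-1}"
LOCATOR_ADDR="${LOCATOR_ADDR:-localhost[10334]}"
LOCATORS="${LOCATORS:-locator1}"
SERVERS="${SERVERS:-server1 server2}"

# Print the command in dry-run mode; otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

rolling_restart() {
  # 1. Restart each locator so it reloads cluster.xml / cluster.properties from disk.
  for loc in $LOCATORS; do
    run gfsh -e "connect --locator=$LOCATOR_ADDR" -e "stop locator --name=$loc"
    run gfsh -e "start locator --name=$loc --load-cluster-configuration-from-dir=true"
  done
  # 2. Restart the servers one at a time so the cluster as a whole stays up.
  for srv in $SERVERS; do
    run gfsh -e "connect --locator=$LOCATOR_ADDR" -e "stop server --name=$srv"
    run gfsh -e "start server --name=$srv --locators=$LOCATOR_ADDR"
  done
}

rolling_restart
```

With DRY_RUN left at its default of 1 nothing is executed; the commands are only echoed so the order can be reviewed.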

This doesn't seem to be the case according to some tests I've done.
The customer's requirement is to be able to manually modify the 
cluster.xml file without downtime, meaning that they are okay with restarting 
the members one at a time, but not all of them at the same time. They can't 
use gfsh scripts to make these modifications; they must be able to manually 
modify the cluster.xml.
For some reason it is always required to stop the entire cluster (locators 
and servers); if you don't, the servers won't get the new cluster 
configuration. This can be reproduced on every run (with one, two or three 
locators, it doesn't matter). The reproducible scenario is attached to the 
JIRA, and the steps are below:
{code:borderStyle=solid}
1. Download and extract the file "workspace.zip".
2. Modify the file "00_setenv.txt", specifically the variables "JAVA_HOME" and 
"GEMFIRE" to use your local installation directories.
3. Execute "01_start_cluster.sh" (starts N locators and M servers, where N and 
M are variables defined in "00_setenv.txt").
4. Execute "02_configure_cluster.sh" (creates two regions and one index, just 
for testing purposes).
5. Execute "03_change_cluster_config.sh". The main goal of this file is to 
replace the "cluster.xml" file with another one (located in 
GemFire/cluster/config/cluster.xml), and restart the members in different 
orders to verify whether the new configuration has been picked up by the 
servers or not. After each selection you can choose the option "6" to verify 
the cluster configuration. As you can see, only option 5 (shutdown the entire 
cluster) works correctly.
6. Execute "04_stop_cluster.sh" and "05_clean_cluster.sh" to delete everything.
{code}

This might be a documentation bug, but I don't think so: if the cluster 
configuration is only stored in the locators, why do options 2 and 4 not work?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
