Hi all,

After I restart the Geode cluster, which has a region of type *PARTITION_REDUNDANT_PERSISTENT*, I see the following in the logs:

*server-1 logs*

..........................................................
Region /ukCustomers (and any colocated sub-regions) has potentially stale data.
Buckets [1, 6, 8] are waiting for another offline member to recover the latest data.
My persistent id is:
  DiskStore ID: 30596906-c97c-4279-89ea-46d088ed27f6
  Name: stay-wrong-zeta
  Location: /10.1.2.28:/scripts/data-1
Offline members with potentially new data:[
  DiskStore ID: f4d5a2f6-7254-4749-ba9f-1831d8215634
  Location: /10.1.2.22:/scripts/data-4
  Buckets: [1, 6, 8]
]
Use the gfsh show missing-disk-stores command to see all disk stores that are being waited on by other members.
..........
Region /ukCustomers has successfully completed waiting for other members to recover the latest data.
My persistent member information:
  DiskStore ID: 30596906-c97c-4279-89ea-46d088ed27f6
  Name: stay-wrong-zeta
  Location: /10.1.2.28:/scripts/data-1
................
Server in /stay-wrong-zeta on server-1-7bfcbd6c7b-b54wb[40404] as stay-wrong-zeta is currently online.
Process ID: 23
Uptime: 1 minute 48 seconds
Geode Version: 1.11.0
Java Version: 1.8.0_212
Log File: /stay-wrong-zeta/stay-wrong-zeta.log
JVM Arguments: -Dgemfire.locators=locator-1[10334],locator-2[10334] -Dgemfire.start-dev-rest-api=false -Dgemfire.use-cluster-configuration=true -Dgemfire.cache-xml-file=/scripts/cache-1.xml -Dgemfire.log-level=error -Xms512m -Xmx512m -XX:+UseG1GC -Dgemfire.launcher.registerSignalHandlers=true -Djava.awt.headless=true -Dsun.rmi.dgc.server.gcInterval=9223372036854775806
Class-Path: /geode/lib/geode-core-1.11.0.jar:/scripts/classpath/domain.jar:/scripts/classpath/spatial4j-0.7.jar:/scripts/classpath/geode-configs.jar:/scripts/classpath/lucene-sandbox-6.6.2.jar:/geode/lib/geode-dependencies.jar

*server-2 logs*

...................................
Region /ukCustomers (and any colocated sub-regions) has potentially stale data.
Buckets [0, 1, 3] are waiting for another offline member to recover the latest data.
My persistent id is:
  DiskStore ID: 2455d3c8-d852-4dac-a743-25ae62f5892c
  Name: kick-drab-bat
  Location: /10.1.2.30:/scripts/data-2
Offline members with potentially new data:[
  DiskStore ID: f4d5a2f6-7254-4749-ba9f-1831d8215634
  Location: /10.1.2.22:/scripts/data-4
  Buckets: [0, 1, 3]
]
Use the gfsh show missing-disk-stores command to see all disk stores that are being waited on by other members.
..........
Region /ukCustomers has successfully completed waiting for other members to recover the latest data.
My persistent member information:
  DiskStore ID: 2455d3c8-d852-4dac-a743-25ae62f5892c
  Name: kick-drab-bat
  Location: /10.1.2.30:/scripts/data-2
..............
Server in /kick-drab-bat on server-2-9cbbd877c-gl6c4[40405] as kick-drab-bat is currently online.
Process ID: 23
Uptime: 1 minute 15 seconds
Geode Version: 1.11.0
Java Version: 1.8.0_212
Log File: /kick-drab-bat/kick-drab-bat.log
JVM Arguments: -Dgemfire.locators=locator-1[10334],locator-2[10334] -Dgemfire.start-dev-rest-api=false -Dgemfire.use-cluster-configuration=true -Dgemfire.cache-xml-file=/scripts/cache-2.xml -Dgemfire.log-level=error -Xms512m -Xmx512m -XX:+UseG1GC -Dgemfire.launcher.registerSignalHandlers=true -Djava.awt.headless=true -Dsun.rmi.dgc.server.gcInterval=9223372036854775806
Class-Path: /geode/lib/geode-core-1.11.0.jar:/scripts/classpath/domain.jar:/scripts/classpath/spatial4j-0.7.jar:/scripts/classpath/geode-configs.jar:/scripts/classpath/lucene-sandbox-6.6.2.jar:/geode/lib/geode-dependencies.jar

When I try to connect to the Geode server using *client-cache*, it throws the following error:

org.apache.geode.cache.client.NoAvailableServersException: null
        at org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.borrowConnection(ConnectionManagerImpl.java:277)
        at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:125)
        at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:108)
        at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:772)
        at org.apache.geode.cache.client.internal.PutAllOp.execute(PutAllOp.java:100)
        at org.apache.geode.cache.client.internal.ServerRegionProxy.putAll(ServerRegionProxy.java:592)
        at org.apache.geode.internal.cache.LocalRegion.basicPutAll(LocalRegion.java:8913)
        at org.apache.geode.internal.cache.LocalRegion.putAll(LocalRegion.java:8846)
        at org.apache.geode.internal.cache.LocalRegion.putAll(LocalRegion.java:8858)
        . . . . . . . . .

However, telnet <<remote hostname>> 40404 works fine.

*What has gone wrong?*

*client-cache.xml* is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<client-cache>
  <pool name="writeCachePool">
    <server host="${server1.url}" port="${server1.port}"/>
    <server host="${server2.url}" port="${server2.port}"/>
  </pool>
  <region name="ukCustomers" refid="PROXY"/>
</client-cache>

Server 1 is restarted using the command

args: ["gfsh", "start", "server", "--locators=locator-1[10334],locator-2[10334]",
       "--rebalance=true", "--server-port=40404", "--log-level=error",
       "--J=-Xms512m", "--J=-Xmx512m", "--J=-XX:+UseG1GC",
       "--classpath=/scripts/classpath/domain.jar", "--cache-xml-file=/scripts/cache-1.xml"]

where cache-1.xml is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<cache version="1.0" is-server="true">
  <disk-store name="disk-store-1" compaction-threshold="40" max-oplog-size="1024"
              queue-size="10000" time-interval="2000" write-buffer-size="65536"
              disk-usage-warning-percentage="80" disk-usage-critical-percentage="98">
    <disk-dirs>
      <disk-dir>/scripts/data-1</disk-dir>
    </disk-dirs>
  </disk-store>
  <region name="ukCustomers" refid="PARTITION_REDUNDANT_PERSISTENT">
    <region-attributes data-policy="persistent-partition" disk-store-name="disk-store-1"
                       statistics-enabled="true" disk-synchronous="true">
      <partition-attributes redundant-copies="1" recovery-delay="5000" startup-recovery-delay="5000"/>
    </region-attributes>
  </region>
</cache>

Server 2 is restarted in a similar manner with cache-2.xml. The only differences in cache-2.xml are that the disk-dir is /scripts/data-2 and disk-store-name="disk-store-2".
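
For anyone trying to reproduce this, the check that the server log itself points to (gfsh show missing-disk-stores) could be run roughly as follows. This is only a sketch under a few assumptions: gfsh is on the PATH, the locator locator-1[10334] (taken from the JVM arguments above) is reachable from wherever the command runs, and LOCATOR and REGION are just illustrative variable names, not anything from my setup.

```shell
#!/bin/sh
# Assumed values: locator address from the JVM arguments, region name from cache-1.xml.
LOCATOR="${LOCATOR:-locator-1[10334]}"
REGION="${REGION:-/ukCustomers}"

if command -v gfsh >/dev/null 2>&1; then
  # Connect to the locator, then list the members, the disk stores that
  # members are still waiting on, and where the region is hosted.
  gfsh -e "connect --locator=${LOCATOR}" \
       -e "list members" \
       -e "show missing-disk-stores" \
       -e "describe region --name=${REGION}"
else
  echo "gfsh not found on PATH; run this from a host or pod that has Geode installed" >&2
fi
```

The output of show missing-disk-stores should indicate whether the disk store f4d5a2f6-7254-4749-ba9f-1831d8215634 at /scripts/data-4 is still being waited on after the restart.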