Kenneth Howe created GEODE-1721:
-----------------------------------

             Summary: Fail cache creation when started by XML file when there 
is a colocated region configuration error that prevents full persistent PR 
bucket recovery
                 Key: GEODE-1721
                 URL: https://issues.apache.org/jira/browse/GEODE-1721
             Project: Geode
          Issue Type: Improvement
          Components: regions
            Reporter: Kenneth Howe


This change was originally proposed on the geode-dev list. This ticket is a 
placeholder for re-evaluating the proposal after GEODE-1117, dealing with 
gateway senders that have manual-start=true, is fixed.

When persistent PRs are colocated, the parent region is created first, but 
persistent data recovery isn’t done until all the colocated regions have been 
created. Currently, if a child region is not created, the cache creation will 
succeed, but persistent data is not recovered. This is the condition reported in 
the Jira ticket.

When caches and regions are created via the APIs, or interactively with gfsh, 
the cache is created, then the parent region(s), then the child region(s). 
There will always be an unknown delay between each of these steps. The parent 
region creation succeeds, but internally Geode does not know when (or if) the 
child regions will be created. Normally the child regions are created after a 
short period and recovery proceeds, so the parent region having unrecovered 
data is a transitory state. If the child region is not created, the parent 
region data will not be recovered. In this case a warning can be logged if the 
missing child regions aren’t created within a reasonable time.

However, when the cache creation is done via a cache.xml file, regions are 
created as part of the cache creation. In this case it’s known fairly quickly 
that there’s a misconfiguration that will prevent persistent PR recovery. Cache 
creation can then fail immediately, alerting the user to the 
misconfiguration.
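
The fail-fast check described above could be sketched roughly as follows. This 
is a standalone model, not Geode code: the class ColocationCheck, its methods, 
and the region names are hypothetical, and it assumes the set of child regions 
each parent expects is already known (the ticket does not say how Geode would 
obtain it).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the proposed check; none of this is Geode API. */
public class ColocationCheck {

    /**
     * Returns one "parent -> child" entry for every expected child region
     * that was not created alongside its parent.
     */
    static List<String> missingChildren(Map<String, Set<String>> expectedChildren,
                                        Set<String> createdRegions) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : expectedChildren.entrySet()) {
            String parent = entry.getKey();
            if (!createdRegions.contains(parent)) {
                continue; // parent is not part of this cache, nothing to recover
            }
            for (String child : entry.getValue()) {
                if (!createdRegions.contains(child)) {
                    missing.add(parent + " -> " + child);
                }
            }
        }
        return missing;
    }

    /**
     * Called once all regions declared in the XML file have been created.
     * Throws so that cache creation fails instead of leaving data unrecovered.
     */
    static void failIfIncomplete(Map<String, Set<String>> expectedChildren,
                                 Set<String> createdRegions) {
        List<String> missing = missingChildren(expectedChildren, createdRegions);
        if (!missing.isEmpty()) {
            throw new IllegalStateException(
                "Missing colocated child regions: " + missing);
        }
    }

    public static void main(String[] args) {
        // Assumed example: "orders" is colocated with "customers".
        Map<String, Set<String>> expected = Map.of("customers", Set.of("orders"));

        // Both regions declared: the check passes silently.
        failIfIncomplete(expected, Set.of("customers", "orders"));

        // Child region missing from the XML: cache creation would fail here.
        try {
            failIfIncomplete(expected, Set.of("customers"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The check only makes sense at the end of XML-driven creation, where the full 
set of declared regions is known. With API or gfsh creation the same condition 
may just mean the child has not been created yet, which is why a delayed 
warning fits that case better.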

Note: Gateway sender queues are implemented internally as colocated child 
regions. If manual-start is set to true in the cache.xml, then the cache is 
created immediately, but the queue isn't created until the sender is started. 
If the queue region were created at cache creation time, even when the sender 
has not been started, then this proposed change could be implemented.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
