Kenneth Howe created GEODE-1721:
-----------------------------------
Summary: Fail cache creation when started by XML file when there
is a colocated region configuration error that prevents full persistent PR
bucket recovery
Key: GEODE-1721
URL: https://issues.apache.org/jira/browse/GEODE-1721
Project: Geode
Issue Type: Improvement
Components: regions
Reporter: Kenneth Howe
This change was originally proposed on the geode-dev list. This ticket is a
placeholder for re-evaluating the proposal after GEODE-1117, dealing with
gateway senders that have manual-start=true, is fixed.
When persistent PRs are colocated, the parent region is created first, but
persistent data recovery isn’t done until all the colocated regions have been
created. Currently, if a child region is never created, cache creation still
succeeds but the persistent data is not recovered. This is the condition
reported in the Jira ticket.
When caches and regions are created via the APIs, or interactively with gfsh,
the cache is created, then the parent region(s), then the child region(s).
There will always be an unknown delay between each of these steps. The parent
region creation succeeds, but internally Geode does not know when (or if) the
child regions will be created. Normally the child regions are created after a
short period and recovery proceeds, so the parent region having unrecovered
data is a transitory state. If the child region is not created, the parent
region data will not be recovered. In this case a warning can be logged if the
missing child regions aren’t created within a reasonable time.
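
As a rough illustration of the warning described above, here is a minimal
sketch. It is not Geode code: the class name, method name, and message text
are all hypothetical, and the "reasonable time" limit is left to the caller.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Geode: decides whether to warn about
// colocated child regions that have not been created yet.
final class MissingChildWarning {
    // Returns a warning message once the wait has reached the limit and
    // children are still missing; otherwise returns an empty Optional.
    static Optional<String> check(String parent, Set<String> missingChildren,
                                  Duration waited, Duration limit) {
        if (missingChildren.isEmpty() || waited.compareTo(limit) < 0) {
            return Optional.empty(); // recovery can proceed, or it is too early to warn
        }
        // TreeSet gives a stable, sorted listing in the message.
        return Optional.of("Persistent data recovery for region " + parent
                + " is waiting for colocated regions " + new TreeSet<>(missingChildren));
    }
}
```

A caller would invoke this periodically after the parent region is created and
log the message when one is returned.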
However, when the cache creation is done via a cache.xml file, regions are
created as part of the cache creation. In this case it’s known fairly quickly
that there’s a misconfiguration that will prevent persistent PR recovery. The
cache creation can then fail immediately, alerting the user to the
misconfiguration.
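
The fail-fast check for the cache.xml case could look roughly like the sketch
below. It is not Geode code and rests on an assumption: that the persisted
data for each parent region records the names of its colocated child regions.
The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not Geode internals. Assumes the persisted data for a
// parent region records the names of its colocated child regions.
final class ColocationCheck {
    // Returns, sorted by name, the persisted child regions that cache.xml did not declare.
    static List<String> missingChildren(Map<String, Set<String>> persistedChildren,
                                        Set<String> declaredRegions) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : persistedChildren.entrySet()) {
            if (!declaredRegions.contains(entry.getKey())) {
                continue; // parent not declared in this cache.xml; nothing to recover here
            }
            for (String child : entry.getValue()) {
                if (!declaredRegions.contains(child)) {
                    missing.add(child);
                }
            }
        }
        Collections.sort(missing);
        return missing;
    }

    // Fails cache creation when any colocated child region is missing.
    static void validateOrFail(Map<String, Set<String>> persistedChildren,
                               Set<String> declaredRegions) {
        List<String> missing = missingChildren(persistedChildren, declaredRegions);
        if (!missing.isEmpty()) {
            throw new IllegalStateException(
                    "Persistent PR recovery is blocked; missing colocated regions: " + missing);
        }
    }
}
```

Because every region in a cache.xml is known once the file has been processed,
this check can run at the end of cache creation with no timeout.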
Note: Gateway sender queues are implemented internally as colocated child
regions. If manual-start is set to true in the cache.xml, then the cache is
created immediately, but the queue isn't created until the sender is started.
If the queue region were created along with the cache, even when the sender
isn't started, then this proposed change could be implemented.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)