Hi all,

The JCR installer was enhanced in SLING-3747 with a feature to pause it temporarily. By pausing and later resuming the JCR installer, a "deployer" can signal to the installer that a set of installable resources should be processed together.

The mechanism to pause the JCR installer is based on the presence of a node in a particular location in the repository. This is required for the feature to work in a cluster, where the installers on all instances need to be paused.

SLING-5421 reports that this mechanism can lead to a permanently paused JCR installer, most likely due to a crash/kill or a premature shutdown/failure of the repository. The possibility of a programming error was ruled out by inspecting the code of the "deployer" (try/finally is used consistently).

Additional robustness comes at the cost of added complexity. E.g. to allow deletion of a pause-marker, the marker needs to be annotated with the Sling-ID of the instance that created it. Otherwise another instance might remove a valid pause-marker.

In order not to burden "deployer" implementations with this complexity, I suggest encapsulating the logic within the installer itself and exposing an API to pause the installer instead. This was the consensus we reached in some offline discussions (see also https://issues.apache.org/jira/browse/SLING-5421).

Any thoughts or objections?

Regards
Julian
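
PS: To make the suggestion a bit more concrete, here is a rough Java sketch of what an encapsulated pause API might look like. It is only an illustration under stated assumptions: the names InstallerPauseSketch, Pause, pause, isPaused and removeStaleMarkers are hypothetical and not part of Sling, and the in-memory map merely stands in for the pause-marker nodes in the repository. The point it shows is that each marker records the Sling-ID of its owner, so cleanup after a crash can remove only that instance's markers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch only; none of these names exist in Sling. */
final class InstallerPauseSketch {

    // In-memory stand-in for the marker nodes in the repository:
    // marker name -> Sling-ID of the instance that created the marker.
    private final Map<String, String> markers = new ConcurrentHashMap<>();
    private int counter;

    /** Handle given to the "deployer"; closing it removes only this marker. */
    final class Pause implements AutoCloseable {
        private final String name;

        Pause(String name) {
            this.name = name;
        }

        @Override
        public void close() {
            markers.remove(name);
        }
    }

    /** Creates a pause-marker annotated with the caller's Sling-ID. */
    synchronized Pause pause(String slingId) {
        String name = slingId + "-" + (counter++);
        markers.put(name, slingId);
        return new Pause(name);
    }

    /** The installer stays paused as long as any marker exists. */
    boolean isPaused() {
        return !markers.isEmpty();
    }

    /**
     * Removes markers owned by the given instance, e.g. when that instance
     * restarts after a crash. Markers of other instances are left alone.
     * Returns the number of markers removed.
     */
    synchronized int removeStaleMarkers(String slingId) {
        int before = markers.size();
        markers.values().removeIf(slingId::equals);
        return before - markers.size();
    }
}
```

A "deployer" would then wrap its work in try-with-resources, e.g. `try (InstallerPauseSketch.Pause p = installer.pause(slingId)) { ... }`, instead of creating and deleting the marker node itself.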
