Hi all

In SLING-3747 the JCR installer was enhanced with a feature to pause
it temporarily. By pausing and later resuming the JCR installer, a
"deployer" can signal to the installer that a set of installable
resources should be processed together.

The mechanism to pause the JCR installer is based on the presence of a
node in a particular location in the repository. This is a requirement
to allow the feature to work in a cluster, where the installers on all
instances need to be paused.

In SLING-5421 it is reported that this mechanism can lead to a
permanently paused JCR installer, most likely due to a crash/kill or a
premature shutdown/failure of the repository. The possibility of a
programming error was ruled out by inspecting the code of the
"deployer" (try/finally is used consistently).

Additional robustness comes at the cost of added complexity. For
example, to allow safe deletion of a stale pause-marker, the marker
needs to be annotated with the Sling-ID of the instance that created
it. Otherwise an instance might remove a valid pause-marker that
belongs to another instance.

In order not to burden "deployer" implementations with this
complexity, I suggest encapsulating the logic within the installer
itself and exposing an API to pause the installer instead. This was
the consensus we reached in some offline discussions.
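
To make the proposal a bit more concrete, here is a minimal sketch of
what such an API could look like. All names (InstallerPauseSketch,
PauseHandle, pause(), removeStaleMarkers()) are hypothetical, and a
plain in-memory map stands in for the marker nodes in the repository.
It is not existing installer code, only an illustration of the idea
that each marker records the Sling-ID of its creator.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only; none of these names exist in the installer.
class InstallerPauseSketch {

    /** Returned by pause(); closing it removes the marker again. */
    public interface PauseHandle extends AutoCloseable {
        @Override
        void close(); // narrowed: no checked exception
    }

    // Stand-in for the marker nodes in the repository, shared by all
    // instances of a cluster: marker id -> Sling-ID of the creator.
    private final Map<String, String> markers;

    // Sling-ID of the local instance (assumed to be passed in).
    private final String slingId;

    InstallerPauseSketch(String slingId, Map<String, String> sharedMarkers) {
        this.slingId = slingId;
        this.markers = sharedMarkers;
    }

    /** Creates a marker annotated with the local Sling-ID. */
    public PauseHandle pause() {
        String markerId = UUID.randomUUID().toString();
        markers.put(markerId, slingId);
        return () -> markers.remove(markerId);
    }

    /** The installer is paused while any marker exists, local or remote. */
    public boolean isPaused() {
        return !markers.isEmpty();
    }

    /**
     * Intended to run on startup, when no local pause can be active:
     * removes only markers created by this instance and leaves the
     * markers of other instances untouched.
     */
    public void removeStaleMarkers() {
        markers.values().removeIf(slingId::equals);
    }
}
```

A "deployer" would then only need try-with-resources, e.g.
`try (PauseHandle h = installer.pause()) { ... }`, and a marker left
behind by a crash could be cleaned up by the owning instance on its
next startup without touching markers of other cluster members.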

(see also https://issues.apache.org/jira/browse/SLING-5421)

Any thoughts or objections?

Regards
Julian
