Re: SLING-5421 - Allow JCR installer to recover from being paused indefinitely

Carsten Ziegeler Mon, 01 Feb 2016 05:43:36 -0800

Julian Sedding wrote
> Hi Carsten
> 
> Thanks for your comments. I agree that it would be nice if we could
> avoid pausing the installer altogether.


If I remember correctly, that's what we said when we added the current
pausing solution: this pausing solution is temporary until we change the
package installer to use the OSGi installer :)

> 
> However, I see some challenges:
> - How do we make sure all nodes in a cluster install the bundles into
> their OSGi environments in a single batch?

Ok, good point - right. Well if a content package would be installed
with a single save this would be easy :)

> - Currently content packages contain bundles that are installed into
> the repository. How could we prevent duplicate installation (by the
> JCR installer triggered via observation and directly by the OSGi
> installer)?

If the content package were installed through the installer, the
bundles would still be installed through observation, but because the
installer is single-threaded, only after all content is installed.

> 
> Do you think it is realistic to solve these issues in the short term?

Not sure, however changing the mechanism as suggested doesn't sound so
easy to me either.
> 
> Even if we can solve them, we will still need reliable
> communication/coordination between cluster nodes. This part, as
> Bertrand suggested in the issue, could be made generic. AFAIK
> ZooKeeper, etcd et al. provide such mechanisms. Maybe we need to
> provide an implementation agnostic API for this in the discovery
> module.

Well, a lot of things are doable - but starting at the real problem and
ending up with such a massive solution spreading across bundles/APIs
doesn't sound appealing to me. I would rather spend the energy thinking
about what the best way to update an installation is, and derive a
possible solution from there. Especially with containers like Docker,
instances are not updated; a new instance with the new configuration is
simply started instead. Therefore I'm wondering if we should really go
this far and add all these things all over the place.

I think for now, the immediate issue to resolve is to recover from being
paused indefinitely. Simplest solution is to require clients to add a
timestamp to the node they create and the node will be removed after a
(long) timeout.
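
A minimal sketch of that idea, just to make the expiry rule concrete. It
models the pause nodes as an in-memory map instead of JCR nodes; the class
name PauseMarkers, its methods and the one-hour timeout used in the usage
note are assumptions for illustration, not the JCR installer's actual API.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical in-memory stand-in for the pause nodes; not Sling API. */
public class PauseMarkers {

    // marker name -> timestamp the client wrote when it created the marker
    private final Map<String, Instant> markers = new HashMap<>();

    // how long a marker may exist before it is considered stale
    private final Duration timeout;

    public PauseMarkers(Duration timeout) {
        this.timeout = timeout;
    }

    /** Client side: create a marker and record its creation time. */
    public void add(String name, Instant createdAt) {
        markers.put(name, createdAt);
    }

    /** Installer side: drop markers older than the timeout, return how many. */
    public int removeExpired(Instant now) {
        int removed = 0;
        Iterator<Map.Entry<String, Instant>> it = markers.entrySet().iterator();
        while (it.hasNext()) {
            Instant createdAt = it.next().getValue();
            if (Duration.between(createdAt, now).compareTo(timeout) > 0) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    /** The installer stays paused while at least one marker is left. */
    public boolean isPaused() {
        return !markers.isEmpty();
    }
}
```

With new PauseMarkers(Duration.ofHours(1)), a marker created at t0 keeps
the installer paused at t0 + 30 minutes, while a removeExpired() call at
t0 + 2 hours removes it and lets the installer resume.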

Regards
Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org
