Unfortunately, I could not reproduce this. Here is what I did:

1. Built the latest Sling Starter locally
2. Started up MongoDB

$ docker run --name mongo-sling -p 27017:27017 -d mongo:3.6

3. Started up Sling

$ java -jar target/org.apache.sling.starter-12-SNAPSHOT.jar -Dsling.run.modes=oak_mongo

4. Waited for Sling to start, clicked around a bit and then shut it
down with Ctrl-C.

5. Started up Sling again, using the same command as in step 3.

Sling started up successfully, with no errors in the log.
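For anyone who wants to repeat this, steps 2, 3 and 5 can be sketched as a small shell script. This is only a convenience wrapper around the two commands above: the `run` helper, the `DRY_RUN` switch, the `SLING_JAR` variable and the function names are hypothetical additions, not part of any Sling tooling. By default it only prints the commands; set `DRY_RUN=0` to actually execute them.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Prints commands unless DRY_RUN=0.
# DRY_RUN and SLING_JAR are illustrative names, not Sling settings.
DRY_RUN="${DRY_RUN:-1}"
SLING_JAR="${SLING_JAR:-target/org.apache.sling.starter-12-SNAPSHOT.jar}"

# Echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 2: start MongoDB in the background.
start_mongo() {
    run docker run --name mongo-sling -p 27017:27017 -d mongo:3.6
}

# Steps 3 and 5: start Sling in the foreground against MongoDB.
start_sling() {
    run java -jar "$SLING_JAR" -Dsling.run.modes=oak_mongo
}
```

After sourcing the file, call `start_mongo` once, then `start_sling`, stop Sling with Ctrl-C, and call `start_sling` a second time to mirror step 5.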

Do you have simplified steps to reproduce that I can try? Sharing a k8s
resource file is fine as well, as long as it is self-contained, so that
I can `kubectl apply -f` it and then follow some directions to get Sling
broken.

Thanks,
Robert


On Fri, 2020-02-07 at 16:58 -0500, Carlos Munoz wrote:
> Thanks Robert. We tried ensuring only a single Sling pod was hitting the
> database at one time, with some strange results:
> 
> The first time it runs (against an empty database) everything goes well:
> the database is populated and the pod comes up with no issues.
> 
> We then bring this pod down and try to bring the exact same one up
> again, and the original exception pops up again:
> 
> 29.01.2020 02:58:59.571 *ERROR* [Apache Sling Repository Startup Thread #4]
> ERROR: Bundle '160' EventDispatcher: Error during dispatch.
> (org.apache.sling.api.SlingException: Can't create the JCR event listener.)
> org.apache.sling.api.SlingException: Can't create the JCR event listener.
> ...
> ...
> Caused by: javax.jcr.LoginException: Can neither derive user name nor principal names for bundle org.apache.sling.jcr.resource [154] and sub service observation
> at org.apache.sling.jcr.base.AbstractSlingRepository2.loginService(AbstractSlingRepository2.java:387)
> 
> I wonder if the sling pod is leaving the database in an unusable state
> when being brought down.
> 
> Regards,
> 
> Carlos
> 
> On Thu, Feb 6, 2020 at 4:11 AM Robert Munteanu <romb...@apache.org>
> wrote:
> 
> > On Wed, 2020-02-05 at 21:17 -0500, Carlos Munoz wrote:
> > > Hi all,
> > > 
> > > I think I have a theory for our issues here, and it may have to do
> > > with the fact that we are running in a heavily containerized
> > > environment (Kubernetes). I wanted to consult with the community
> > > experts to see what you thought.
> > > 
> > > The way our container platform works on an update is that it will try
> > > to bring up a new container with sling (and our application) against
> > > the same mongo database that an original (and still running) container
> > > is running against. Now this works fine when the only thing being
> > > updated is our application bundle, but it starts encountering problems
> > > when several other bundles and configurations are being updated (some
> > > removed, some added, some updated). I *think* the core of the problem
> > > here is that the bundles and configurations are all stored in the
> > > database itself, and two containers with potentially different bundle
> > > versions and configurations are attempting to use it simultaneously.
> > 
> > That is a pretty good guess I'd say :-)
> > 
> > I did see some similar problems when using Sling for development
> > purposes on k8s. I never went to production with it, but for my own
> > purposes it was enough to ensure that only one Sling pod was starting
> > up at a time. Maybe you can try that as well?
> > 
> > A more involved solution would be to use the CompositeNodeStore [1],
> > which is designed to separate the storage of /libs and /apps from the
> > rest of the repository. So for instance you'd have /libs and /apps
> > stored on a local segment store for each pod, and the rest of the
> > content in Mongo.
> > 
> > Unfortunately there is very little documentation and no tooling around
> > it available, so that makes it a difficult proposition.
> > 
> > Thanks,
> > Robert
> > 
> > 
> > [1]: https://jackrabbit.apache.org/oak/docs/nodestore/compositens.html
> > 
> > > If I am right, then our core problem to figure out is how to upgrade
> > > a database from one sling version to the next.
> > > 
> > > Let me know what you all think.
> > > 
> > > Regards,
> > > 
> > > Carlos
> > > 
> > > On Tue, Feb 4, 2020 at 7:06 AM Carlos Munoz <camu...@redhat.com>
> > > wrote:
> > > 
> > > > Thanks Bertrand! I will continue my fact finding mission here :)
> > > > 
> > > > Regards,
> > > > 
> > > > Carlos
> > > > 
> > > > On Tue, Feb 4, 2020 at 4:31 AM Bertrand Delacretaz <bdelacre...@apache.org> wrote:
> > > > 
> > > > > Hi,
> > > > > 
> > > > > On Sun, Feb 2, 2020 at 4:50 AM Carlos Munoz <camu...@redhat.com> wrote:
> > > > > > ...do configurations from the repoinit files get installed in a
> > > > > > specific order with relation to the artifacts?...
> > > > > 
> > > > > The repoinit configs are applied by a single
> > > > > SlingRepositoryInitializer [1] service which is implemented by
> > > > > org.apache.sling.jcr.repoinit.impl.RepositoryInitializer [2].
> > > > > 
> > > > > The execution order of the SlingRepositoryInitializer services is
> > > > > based on their service rankings [4] and the RepositoryInitializer
> > > > > processes its configurations in the order in which they are
> > > > > provided by the OSGi framework, sequentially.
> > > > > 
> > > > > All this happens before the SlingRepository service is made
> > > > > available [3].
> > > > > 
> > > > > The logs should help understand what's going on but IIRC it all
> > > > > happens in a single thread.
> > > > > 
> > > > > -Bertrand
> > > > > 
> > > > > [1] https://sling.apache.org/documentation/bundles/repository-initialization.html
> > > > > [2] https://github.com/apache/sling-org-apache-sling-jcr-repoinit/blob/master/src/main/java/org/apache/sling/jcr/repoinit/impl/RepositoryInitializer.java
> > > > > [3] https://github.com/apache/sling-org-apache-sling-jcr-base/blob/e8fe5e004b5af1802bb2a76dbbb583a437f848ee/src/main/java/org/apache/sling/jcr/base/AbstractSlingRepositoryManager.java#L511
> > > > > [4] https://github.com/apache/sling-org-apache-sling-jcr-base/blob/e8fe5e004b5af1802bb2a76dbbb583a437f848ee/src/main/java/org/apache/sling/jcr/base/AbstractSlingRepositoryManager.java#L581
> > > > > 
