[
https://issues.apache.org/jira/browse/SLING-7811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16571774#comment-16571774
]
Robert Munteanu commented on SLING-7811:
----------------------------------------
Making the {{LoginAdminWhitelist}} require a configuration has the nice effect
of locking up the whole initialisation process, which is, I guess, correct but
suboptimal :)
My understanding is that the following happens:
* the configadmin bundle starts processing configurations and configuring
components
* the OakSlingRepositoryManager component is configured
** in the activate method, it calls {{AbstractSlingRepositoryManager#start}}
** in this method, we wait for the {{LoginAdminWhitelist}} service
* since that component needs a configuration *and* we're blocking in the
thread that configures components, we're effectively in a deadlock
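
To illustrate the sequence above, here is a minimal plain-Java sketch, not the actual Sling or configadmin code. It assumes a single thread processes configurations in order; the class and method names ({{ConfigThreadDeadlockSketch}}, {{activateOnSingleConfigThread}}) are made up for the example, and a timeout is used so that the sketch terminates instead of hanging.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ConfigThreadDeadlockSketch {

    /**
     * Returns true if the simulated repository manager saw the whitelist
     * within the timeout, false if the wait timed out.
     */
    static boolean activateOnSingleConfigThread(long timeoutMillis) throws Exception {
        // Assumption: one thread configures all components, one after the other.
        ExecutorService configThread = Executors.newSingleThreadExecutor();
        CountDownLatch whitelistRegistered = new CountDownLatch(1);
        try {
            // Task 1: the repository manager's activate method blocks on the whitelist.
            Callable<Boolean> activate =
                () -> whitelistRegistered.await(timeoutMillis, TimeUnit.MILLISECONDS);
            Future<Boolean> repositoryManager = configThread.submit(activate);

            // Task 2: configuring the whitelist. It is queued behind task 1 on the
            // same thread, so it cannot run while task 1 is still waiting.
            configThread.execute(whitelistRegistered::countDown);

            return repositoryManager.get();
        } finally {
            configThread.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("whitelist available in time: "
            + activateOnSingleConfigThread(200));
    }
}
```

Without the timeout, the first task would wait forever, which is the deadlock described above.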
I found two solutions that fix startup, but I don't think either is optimal.
h1. Teach the {{AbstractSlingRepositoryManager}} to wait smarter for the login admin whitelist
I added a stricter way of waiting for the whitelist service, still in an async
thread. I am now waiting for a configurable number of {{WhitelistFragment}}
services and only then looking for the {{LoginWhitelistComponent}}. That,
coupled with proper shutdown handling of the init thread, leads to a nicer
(subjectively speaking) log entry:
{noformat}
07.08.2018 15:23:30.219 *INFO* [Apache Sling Repository Startup Thread #1] org.apache.sling.jcr.oak.server.internal.OakSlingRepositoryManager Interrupted while waiting for the LoginAdminWhitelist service, cancelling initialisation
java.lang.InterruptedException: null
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
	at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
	at org.apache.sling.jcr.base.AbstractSlingRepositoryManager$ServiceTrackerBarrier.await(AbstractSlingRepositoryManager.java:700) [org.apache.sling.jcr.base:3.0.5.SNAPSHOT]
	at org.apache.sling.jcr.base.AbstractSlingRepositoryManager$3.run(AbstractSlingRepositoryManager.java:451) [org.apache.sling.jcr.base:3.0.5.SNAPSHOT]
{noformat}
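
The waiting strategy described above can be sketched roughly as follows. This is a simplified stand-in, not the actual {{ServiceTrackerBarrier}} code: {{FragmentBarrier}}, {{startAsync}} and {{run}} are hypothetical names, and the real implementation tracks OSGi services rather than plain method calls. The sketch only shows the two outcomes: the barrier opens once the expected number of fragments has arrived, or the init thread is interrupted on shutdown and cancels initialisation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class WhitelistBarrierSketch {

    /** Hypothetical barrier: opens once the expected number of fragments was seen. */
    static final class FragmentBarrier {
        private final CountDownLatch latch;

        FragmentBarrier(int expectedFragments) {
            this.latch = new CountDownLatch(expectedFragments);
        }

        void fragmentAdded() {
            latch.countDown();
        }

        void await() throws InterruptedException {
            latch.await();
        }
    }

    /** Starts the init thread; it records "started" or "cancelled" in outcome. */
    static Thread startAsync(FragmentBarrier barrier, AtomicReference<String> outcome) {
        Thread init = new Thread(() -> {
            try {
                barrier.await();
                outcome.set("started");
            } catch (InterruptedException e) {
                // Shutdown requested while waiting: cancel instead of failing later.
                Thread.currentThread().interrupt();
                outcome.set("cancelled");
            }
        }, "Repository Startup Thread (sketch)");
        init.start();
        return init;
    }

    /** Runs one scenario and returns the recorded outcome. */
    static String run(int expected, int arriving, boolean shutdown)
            throws InterruptedException {
        FragmentBarrier barrier = new FragmentBarrier(expected);
        AtomicReference<String> outcome = new AtomicReference<>("pending");
        Thread init = startAsync(barrier, outcome);
        for (int i = 0; i < arriving; i++) {
            barrier.fragmentAdded();
        }
        if (shutdown) {
            init.interrupt();
        }
        init.join(2000);
        return outcome.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(2, 2, false));
        System.out.println(run(2, 1, true));
    }
}
```

The second scenario corresponds to the log entry above: the init thread is interrupted while waiting and reports the cancellation instead of running into an error.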
*Pros*:
* fixes the error log
* fix self-contained in the jcr.base bundle
*Cons*:
* too much knowledge about the {{LoginAdminWhitelist}} baked into the
{{AbstractSlingRepositoryManager}}.
* usually the repository start fails once (interrupted) and then works
h1. Handle the extra requirement using declarative services
I tried making the {{LoginAdminWhitelist}} a dependency of the
{{OakSlingRepositoryManager}}, but since it's an internal class I can't depend
on it from the {{oak-server}} bundle.
I convinced the {{OakSlingRepositoryManager}} to wait for the
{{LoginAdminWhitelist}} by adding a {{RepositoryRequirement}} marker interface
to {{jcr.base}} and registering the login admin component under that service as
well. Then, from the {{OakSlingRepositoryManager}} I added a dependency to 0..n
{{RepositoryRequirement}} instances, and tweaked the cardinality requirements
as follows:
{code:java}
# SLING-7811 - wait until at least one admin config fragment exists
org.apache.sling.jcr.base.internal.LoginAdminWhitelist
  WhitelistFragment.cardinality.minimum=I"1"
# SLING-7811 - wait until at least one admin config fragment exists
org.apache.sling.jcr.oak.server.internal.OakSlingRepositoryManager
  prerequisites.cardinality.minimum=I"1"
{code}
In practice, this means that the {{LoginAdminWhitelist}} is registered after
the first fragment and only then the {{OakSlingRepositoryManager}} is started.
Repository initialisation is still done in an async manner.
*Pros*:
* streamlined startup, no cancellations
* removes code for waiting on the login admin whitelist
*Cons*:
* two components now have required configurations
* the dependency is in the wrong place: the requirement is for the base class
{{AbstractSlingRepositoryManager}}, but it's expressed in the
{{OakSlingRepositoryManager}}
* adds API
Thoughts?
> NPE when repository is starting up
> ----------------------------------
>
> Key: SLING-7811
> URL: https://issues.apache.org/jira/browse/SLING-7811
> Project: Sling
> Issue Type: Bug
> Components: JCR
> Affects Versions: JCR Oak Server 1.1.4, JCR Base 3.0.4
> Reporter: Carsten Ziegeler
> Assignee: Robert Munteanu
> Priority: Major
> Fix For: JCR Base 3.0.6, JCR Oak Server 1.2.2
>
>
> With the latest Sling Starter, the following NPE occurs in the logs. It seems
> to be harmless, nevertheless we should fix it:
> For now I assigned it to both, JCR Base and Oak Server, as it's unclear which
> one it is. Interestingly we've released Oak Server 1.2.0 but are not using it
> in the starter.
> {noformat}
> 06.08.2018 15:45:18.396 *ERROR* [Apache Sling Repository Startup Thread] org.apache.sling.jcr.oak.server.internal.OakSlingRepositoryManager start: Uncaught Throwable trying to access Repository, calling stopRepository()
> java.lang.NullPointerException: null
> 	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:192) [com.google.guava:15.0.0]
> 	at org.apache.jackrabbit.oak.jcr.Jcr.with(Jcr.java:296) [org.apache.jackrabbit.oak-jcr:1.6.8]
> 	at org.apache.sling.jcr.oak.server.internal.OakSlingRepositoryManager.acquireRepository(OakSlingRepositoryManager.java:161) [org.apache.sling.jcr.oak.server:1.1.4]
> 	at org.apache.sling.jcr.base.AbstractSlingRepositoryManager.initializeAndRegisterRepositoryService(AbstractSlingRepositoryManager.java:471) [org.apache.sling.jcr.base:3.0.4]
> 	at org.apache.sling.jcr.base.AbstractSlingRepositoryManager.access$300(AbstractSlingRepositoryManager.java:85) [org.apache.sling.jcr.base:3.0.4]
> 	at org.apache.sling.jcr.base.AbstractSlingRepositoryManager$4.run(AbstractSlingRepositoryManager.java:455) [org.apache.sling.jcr.base:3.0.4]
> {noformat}
> The stack trace points to a null workspace name (see
> https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.8/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/Jcr.java#L296).