[ 
https://issues.apache.org/jira/browse/SLING-7811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16571774#comment-16571774
 ] 

Robert Munteanu commented on SLING-7811:
----------------------------------------

Making the {{LoginAdminWhitelist}} require a configuration has the nice effect 
of locking up the whole initialisation process, which is, I guess, correct, but 
suboptimal :)

My understanding is that the following happens:
 * the configadmin bundle starts processing configurations and configuring 
components
 * the OakSlingRepositoryManager component is configured
 ** in the activate method, it calls {{AbstractSlingRepositoryManager#start}}
 ** in this method, we wait for the {{LoginAdminWhitelist}} service
 * since that component needs a configuration *and* we're blocking in the 
thread that configures components, we're effectively in a deadlock
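
To make the sequence above concrete, here is a minimal plain-Java sketch, not the actual Sling or configadmin code. It assumes a single thread delivers all configurations; the class and method names are made up for illustration, and a timeout stands in for the otherwise unbounded wait so that the example terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical model: one executor thread plays the role of the thread
// that configures components. Nothing here is Sling or OSGi API.
class ConfigThreadBlockingSketch {

    /**
     * Returns true if the waiting "activate" task saw the whitelist being
     * configured before the timeout, false if it gave up.
     */
    static boolean whitelistSeen(boolean waitOnConfigThread, long timeoutMillis) {
        ExecutorService configThread = Executors.newSingleThreadExecutor();
        ExecutorService otherThread = Executors.newSingleThreadExecutor();
        CountDownLatch whitelistConfigured = new CountDownLatch(1);
        try {
            // Step 1: the repository manager is activated and waits for the whitelist.
            ExecutorService waiter = waitOnConfigThread ? configThread : otherThread;
            Future<Boolean> activation = waiter.submit(
                    () -> whitelistConfigured.await(timeoutMillis, TimeUnit.MILLISECONDS));
            // Step 2: the whitelist configuration is delivered on the config thread.
            // If step 1 occupies that thread, this task cannot run until step 1 gives up.
            configThread.execute(whitelistConfigured::countDown);
            return activation.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            configThread.shutdownNow();
            otherThread.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("wait on config thread: " + whitelistSeen(true, 200));
        System.out.println("wait on other thread:  " + whitelistSeen(false, 2000));
    }
}
```

When the wait runs on the config thread, the configuration it is waiting for is queued behind it and the wait can only end by timing out; with an unbounded wait that would be the deadlock described above.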

I found two solutions that fix startup, but I don't think either is optimal.
h1. Teach the {{AbstractSlingRepositoryManager}} to wait smarter for the login 
admin whitelist

I added a stricter way of waiting for the whitelist service, still in an async 
thread. I now wait for a configurable number of {{WhitelistFragment}} 
services and only then look for the {{LoginWhitelistComponent}}. That, 
coupled with proper shutdown handling of the init thread, leads to a nicer 
(subjectively speaking) log entry:
{noformat}
07.08.2018 15:23:30.219 *INFO* [Apache Sling Repository Startup Thread #1] 
org.apache.sling.jcr.oak.server.internal.OakSlingRepositoryManager Interrupted 
while waiting for the LoginAdminWhitelist service, cancelling initialisation
java.lang.InterruptedException: null
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
        at 
org.apache.sling.jcr.base.AbstractSlingRepositoryManager$ServiceTrackerBarrier.await(AbstractSlingRepositoryManager.java:700)
 [org.apache.sling.jcr.base:3.0.5.SNAPSHOT]
        at 
org.apache.sling.jcr.base.AbstractSlingRepositoryManager$3.run(AbstractSlingRepositoryManager.java:451)
 [org.apache.sling.jcr.base:3.0.5.SNAPSHOT]
{noformat}
*Pros*:
 * fixes the error log
 * fix self-contained in the jcr.base bundle

*Cons*:
 * too much knowledge about the {{LoginAdminWhitelist}} baked into the 
{{AbstractSlingRepositoryManager}}.
 * usually the repository start fails once (interrupted) and then works
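
The waiting scheme of this first option can be sketched roughly as follows. This is a simplified stand-in, not the {{ServiceTrackerBarrier}} from the patch: it assumes a plain {{CountDownLatch}} counting fragment registrations, and the class name, method name and grace period are invented for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model of "wait for N fragments on an async thread and
// cancel cleanly on shutdown". Names are illustrative only.
class StartupBarrierSketch {

    enum Outcome { STARTED, CANCELLED }

    /**
     * Starts a startup thread that waits for expectedFragments registrations,
     * registers registeredFragments of them, and interrupts the thread if it
     * is still waiting after graceMillis (must be greater than 0).
     */
    static Outcome startRepository(int expectedFragments, int registeredFragments, long graceMillis) {
        CountDownLatch fragments = new CountDownLatch(expectedFragments);
        AtomicReference<Outcome> outcome = new AtomicReference<>();

        Thread startup = new Thread(() -> {
            try {
                fragments.await();               // block until all expected fragments arrived
                outcome.set(Outcome.STARTED);    // the real code would initialise the repository here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                outcome.set(Outcome.CANCELLED);  // shutdown requested while still waiting
            }
        }, "Repository Startup Thread (sketch)");
        startup.start();

        // Simulate fragment services being registered.
        for (int i = 0; i < registeredFragments; i++) {
            fragments.countDown();
        }

        try {
            startup.join(graceMillis);
            if (startup.isAlive()) {
                startup.interrupt();             // shutdown handling: cancel the pending wait
                startup.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return outcome.get();
    }
}
```

The cancelled branch corresponds to the "Interrupted while waiting" log entry above: the startup thread reports the interruption and stops instead of failing later with an error.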

h1. Handle the extra requirement using declarative services

I tried making the {{LoginAdminWhitelist}} a dependency of the 
{{OakSlingRepositoryManager}}, but since it's an internal class I can't depend 
on it from the {{oak-server}} bundle.

I convinced the {{OakSlingRepositoryManager}} to wait for the 
{{LoginAdminWhitelist}} by adding a {{RepositoryRequirement}} marker interface 
to {{jcr.base}} and registering the login admin component under that service as 
well. Then, from the {{OakSlingRepositoryManager}} I added a dependency to 0..n 
{{RepositoryRequirement}} instances, and tweaked the cardinality requirements 
as follows:
{code:java}
  # SLING-7811 - wait until at least one admin config fragment exists
  org.apache.sling.jcr.base.internal.LoginAdminWhitelist
    WhitelistFragment.cardinality.minimum=I"1"

  # SLING-7811 - wait until at least one admin config fragment exists
  org.apache.sling.jcr.oak.server.internal.OakSlingRepositoryManager
    prerequisites.cardinality.minimum=I"1"
{code}
In practice, this means that the {{LoginAdminWhitelist}} is registered after 
the first fragment and only then the {{OakSlingRepositoryManager}} is started.

Repository initialisation is still done in an async manner.
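
The resulting activation order can be illustrated with a toy model in plain Java. It is not the declarative services runtime and ignores required configurations and service ranking; it only assumes that a component activates once its number of bound references reaches the configured minimum. All class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of minimum-cardinality gating; not OSGi API.
class CardinalityChainSketch {

    static class Component {
        final String name;
        final int minimum;                      // stands in for *.cardinality.minimum
        final List<Component> dependents = new ArrayList<>();
        final List<String> activationLog;
        int bound;
        boolean active;

        Component(String name, int minimum, List<String> activationLog) {
            this.name = name;
            this.minimum = minimum;
            this.activationLog = activationLog;
        }

        // Called whenever one more required reference becomes available.
        void bind() {
            bound++;
            if (!active && bound >= minimum) {
                active = true;
                activationLog.add(name);
                // Once active, this component satisfies one reference of each dependent.
                for (Component dependent : dependents) {
                    dependent.bind();
                }
            }
        }
    }

    /** Returns the component names in the order they became active. */
    static List<String> activationOrder(int registeredFragments) {
        List<String> log = new ArrayList<>();
        Component manager = new Component("OakSlingRepositoryManager", 1, log);
        Component whitelist = new Component("LoginAdminWhitelist", 1, log);
        whitelist.dependents.add(manager);      // whitelist doubles as a RepositoryRequirement
        for (int i = 0; i < registeredFragments; i++) {
            whitelist.bind();                   // each WhitelistFragment binds to the whitelist
        }
        return log;
    }
}
```

With no fragments nothing activates; with one or more, the whitelist activates first and the repository manager follows, without either side blocking a thread.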

*Pros*:
 * streamlined startup, no cancellations
 * removes code for waiting on the login admin whitelist

*Cons*:
 * two components now have required configurations
 * the dependency is in the wrong place: the requirement is for the base class 
{{AbstractSlingRepositoryManager}}, but it's expressed in the 
{{OakSlingRepositoryManager}}
 * adds API

Thoughts?

> NPE when repository is starting up
> ----------------------------------
>
>                 Key: SLING-7811
>                 URL: https://issues.apache.org/jira/browse/SLING-7811
>             Project: Sling
>          Issue Type: Bug
>          Components: JCR
>    Affects Versions: JCR Oak Server 1.1.4, JCR Base 3.0.4
>            Reporter: Carsten Ziegeler
>            Assignee: Robert Munteanu
>            Priority: Major
>             Fix For: JCR Base 3.0.6, JCR Oak Server 1.2.2
>
>
> With the latest Sling Starter, the following NPE occurs in the logs. It seems 
> to be harmless, nevertheless we should fix it:
> For now I assigned it to both, JCR Base and Oak Server, as it's unclear which 
> one it is. Interestingly we've released Oak Server 1.2.0 but are not using it 
> in the starter.
> {noformat}
> 06.08.2018 15:45:18.396 *ERROR* [Apache Sling Repository Startup Thread] 
> org.apache.sling.jcr.oak.server.internal.OakSlingRepositoryManager start: 
> Uncaught Throwable trying to access Repository, calling stopRepository()
> java.lang.NullPointerException: null
>         at 
> com.google.common.base.Preconditions.checkNotNull(Preconditions.java:192) 
> [com.google.guava:15.0.0]
>         at org.apache.jackrabbit.oak.jcr.Jcr.with(Jcr.java:296) 
> [org.apache.jackrabbit.oak-jcr:1.6.8]
>         at 
> org.apache.sling.jcr.oak.server.internal.OakSlingRepositoryManager.acquireRepository(OakSlingRepositoryManager.java:161)
>  [org.apache.sling.jcr.oak.server:1.1.4]
>         at 
> org.apache.sling.jcr.base.AbstractSlingRepositoryManager.initializeAndRegisterRepositoryService(AbstractSlingRepositoryManager.java:471)
>  [org.apache.sling.jcr.base:3.0.4]
>         at 
> org.apache.sling.jcr.base.AbstractSlingRepositoryManager.access$300(AbstractSlingRepositoryManager.java:85)
>  [org.apache.sling.jcr.base:3.0.4]
>         at 
> org.apache.sling.jcr.base.AbstractSlingRepositoryManager$4.run(AbstractSlingRepositoryManager.java:455)
>  [org.apache.sling.jcr.base:3.0.4]
> {noformat}
> The stack trace points to a null workspace name ( see 
> https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.8/oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/Jcr.java#L296
>  ).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
