[
https://issues.apache.org/jira/browse/SLING-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800473#comment-13800473
]
Chetan Mehrotra commented on SLING-3189:
----------------------------------------
From seeing the thread dump, I think the following is happening:
# Sling installer installs all bundles at a given Start Level like 1
# At start level 1 we have various bundles like Commons Log and Jetty
# Jetty has an optional dependency on SLF4J
# Once all the bundles are resolved the bundles are started. In some cases
Jetty bundle gets started first.
# If Jetty gets started first, it triggers SLF4J initialization in a separate
thread (different from the main Framework thread), and SLF4J then accesses the
StaticLoggerBinder provided by the Logback jar packaged within the Commons Log
bundle. This is triggered via the static initializer logic in
org.slf4j.LoggerFactory#getILoggerFactory [2]. Note that by this time the
Commons Log bundle has not been started. At some stage in this process, that
thread needs the lock on the Commons Log classloader
# The Commons Log bundle itself requires access to the return value of
LoggerFactory.getILoggerFactory() in its BundleActivator.start, as it is that
singleton LoggerContext instance which needs to be used by LogbackManager for
further configuration. It currently holds the Commons Log classloader lock
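The lock ordering in the steps above can be modelled with two plain locks. This
is only a simplified sketch, not the Felix or SLF4J code: the class
LockOrderSketch, both locks and the helper names are hypothetical stand-ins, and
timed tryLock calls are used so that the demo reports the cycle instead of
hanging.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the two threads described above; not real Sling code.
class LockOrderSketch {

    /** Returns true if neither thread could take its second lock. */
    static boolean wouldDeadlock() {
        // Stand-in for the Commons Log bundle classloader lock.
        ReentrantLock classLoaderLock = new ReentrantLock();
        // Stand-in for the "SLF4J initialization in progress" state.
        ReentrantLock slf4jInitLock = new ReentrantLock();
        CyclicBarrier barrier = new CyclicBarrier(2);
        boolean[] gotSecond = new boolean[2];

        // Jetty thread: inside SLF4J init, then needs the classloader.
        Thread jetty = new Thread(() ->
            gotSecond[0] = holdThenTry(slf4jInitLock, classLoaderLock, barrier));
        // Activator thread: holds the classloader, then needs SLF4J.
        Thread activator = new Thread(() ->
            gotSecond[1] = holdThenTry(classLoaderLock, slf4jInitLock, barrier));

        jetty.start();
        activator.start();
        try {
            jetty.join();
            activator.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !gotSecond[0] && !gotSecond[1];
    }

    static boolean holdThenTry(ReentrantLock first, ReentrantLock second,
                               CyclicBarrier barrier) {
        first.lock();
        try {
            barrier.await(); // both threads now hold their first lock
            boolean acquired = second.tryLock(100, TimeUnit.MILLISECONDS);
            if (acquired) {
                second.unlock();
            }
            barrier.await(); // keep the first lock until both attempts are done
            return acquired;
        } catch (Exception e) {
            return false;
        } finally {
            first.unlock();
        }
    }
}
```

Calling LockOrderSketch.wouldDeadlock() returns true here because each thread
holds the lock the other one is waiting for. With untimed locking, both threads
would block forever.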
This circular wait leads to a deadlock and stalls the system. One possible fix
I can think of is
# Commons Log activator checks if the LoggerFactory is initialized i.e.
{{LoggerFactory.getILoggerFactory() instanceof LoggerContext}}
# If not initialized, then it would start the initialization in a separate
thread, where it would wait until {{LoggerFactory}} is initialized and only then
start LogbackManager
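A rough sketch of the two steps above, to make the idea concrete. To keep it
free of SLF4J and Logback dependencies it uses stand-ins, all of them
hypothetical: ReadyContext plays the role of LoggerContext, the factory supplier
plays the role of {{LoggerFactory.getILoggerFactory()}}, and startManager plays
the role of starting LogbackManager. The polling interval is an assumption as
well.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical sketch of the proposed activator logic; not the actual fix.
class DeferredStartSketch {

    /** Stand-in for Logback's LoggerContext. */
    static final class ReadyContext {}

    /**
     * Runs startManager immediately if the factory is already initialized.
     * Otherwise polls on a separate daemon thread and runs startManager once
     * the factory returns a ReadyContext. The returned future completes after
     * startManager has run.
     */
    static CompletableFuture<Void> startWhenReady(Supplier<Object> factory,
                                                  Runnable startManager,
                                                  long pollMillis) {
        if (factory.get() instanceof ReadyContext) {
            startManager.run();
            return CompletableFuture.completedFuture(null);
        }
        CompletableFuture<Void> started = new CompletableFuture<>();
        Thread waiter = new Thread(() -> {
            try {
                // Wait off the calling thread so the caller can return
                // and release whatever locks it holds.
                while (!(factory.get() instanceof ReadyContext)) {
                    Thread.sleep(pollMillis);
                }
                startManager.run();
                started.complete(null);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                started.completeExceptionally(e);
            }
        }, "logger-factory-waiter");
        waiter.setDaemon(true);
        waiter.start();
        return started;
    }
}
```

The point of the design is that BundleActivator.start would return without
blocking on the initialization, so the thread that is initializing SLF4J can
make progress. Whether polling or some notification is preferable is open.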
Thoughts?
[1] http://bugzilla.slf4j.org/show_bug.cgi?id=106
[2]
https://github.com/qos-ch/slf4j/blob/master/slf4j-api/src/main/java/org/slf4j/LoggerFactory.java#L292
> LogbackManager causes lockup on second startup
> ----------------------------------------------
>
> Key: SLING-3189
> URL: https://issues.apache.org/jira/browse/SLING-3189
> Project: Sling
> Issue Type: Bug
> Components: Commons
> Affects Versions: Commons Log 3.0.2
> Reporter: Bertrand Delacretaz
> Attachments: SLING-3189.stack.txt
>
>
> Looks like I didn't test yesterday's SLING-3185 fix thoroughly enough, I'm
> seeing a system lockup on second startup of the current trunk launchpad.
> Steps to reproduce:
> 1. Start launchpad/builder runnable jar with no existing sling folder, Sling
> starts fine.
> 2. Stop and restart.
> 3. Sling doesn't start, the BundleWiringImpl waits forever on a classloading
> call triggered by StaticLoggerBinder.getSingleton()
> Stack trace:
> Daemon Thread [FelixStartLevel] (Suspended)
> owns: BundleWiringImpl$BundleClassLoaderJava5 (id=40)
> waiting for: HashMap<K,V> (id=41)
> Object.wait(long) line: not available [native method]
> HashMap<K,V>(Object).wait() line: 485
> BundleWiringImpl$BundleClassLoaderJava5(BundleWiringImpl$BundleClassLoader).findClass(String) line: 2050
> BundleWiringImpl.findClassOrResourceByDelegation(String, boolean) line: 1472
> BundleWiringImpl.access$400(BundleWiringImpl, String, boolean) line: 75
> BundleWiringImpl$BundleClassLoaderJava5(BundleWiringImpl$BundleClassLoader).loadClass(String, boolean) line: 1923
> BundleWiringImpl$BundleClassLoaderJava5(ClassLoader).loadClass(String) line: 247
> LogbackManager.ensureSlf4jIsInitialized() line: 147
> LogbackManager.<init>(BundleContext) line: 103
> Activator.start(BundleContext) line: 55