[ 
https://issues.apache.org/jira/browse/SLING-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800473#comment-13800473
 ] 

Chetan Mehrotra commented on SLING-3189:
----------------------------------------

From the thread dump, I think the following is happening:

# The Sling installer installs all bundles at a given start level, e.g. 1
# At start level 1 we have various bundles, such as Commons Log and Jetty
# Jetty has an optional dependency on SLF4J
# Once all the bundles are resolved, they are started. In some cases the 
Jetty bundle gets started first. 
# If Jetty gets started first, it triggers SLF4J initialization in a separate 
thread (different from the main Framework thread), and SLF4J accesses the 
StaticLoggerBinder provided by the Logback jar packaged within the Commons Log 
bundle. This is triggered via the static initializer logic in 
org.slf4j.LoggerFactory#getILoggerFactory [2]. Note that by this time the 
Commons Log bundle has not been started. At some stage in this process the 
thread needs the lock on the Commons Log classloader
# The Commons Log bundle itself requires access to the return value of 
LoggerFactory.getILoggerFactory() in its BundleActivator.start, as it is that 
singleton LoggerContext instance which LogbackManager needs for further 
configuration. At that point it holds the Commons Log classloader lock
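
To make the lock ordering in the steps above concrete, here is a minimal, self-contained sketch. It does not use Felix or SLF4J at all: the two {{ReentrantLock}} fields are hypothetical stand-ins for the Commons Log classloader lock and the SLF4J initialization state, and the thread roles are my reading of the thread dump. A timed {{tryLock}} is used so that the demo reports the conflict instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderSketch {
    // Hypothetical stand-ins, not real Felix or SLF4J objects.
    static final ReentrantLock classLoaderLock = new ReentrantLock();
    static final ReentrantLock slf4jInitLock = new ReentrantLock();

    /**
     * Takes 'first', waits until both threads hold their first lock, then
     * tries 'second' with a timeout. 'first' is held until both threads
     * have finished their attempt, so the outcome is deterministic.
     */
    static boolean attempt(ReentrantLock first, ReentrantLock second,
                           CountDownLatch bothHoldFirst, CountDownLatch bothAttempted)
            throws InterruptedException {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            boolean got = second.tryLock(100, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            bothAttempted.countDown();
            bothAttempted.await();
            return got;
        } finally {
            first.unlock();
        }
    }

    /** Returns { frameworkThreadGotBoth, jettyThreadGotBoth }. */
    static boolean[] run() throws Exception {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothAttempted = new CountDownLatch(2);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Framework thread: holds the classloader lock, wants SLF4J state.
            Future<Boolean> framework = pool.submit(
                () -> attempt(classLoaderLock, slf4jInitLock, bothHoldFirst, bothAttempted));
            // Jetty-triggered thread: inside SLF4J init, wants the classloader lock.
            Future<Boolean> jetty = pool.submit(
                () -> attempt(slf4jInitLock, classLoaderLock, bothHoldFirst, bothAttempted));
            return new boolean[] { framework.get(), jetty.get() };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] result = run();
        System.out.println("framework thread acquired both: " + result[0]);
        System.out.println("jetty thread acquired both: " + result[1]);
    }
}
```

With plain blocking acquisition instead of the timed {{tryLock}}, neither thread could ever proceed, which is the situation seen in the attached stack.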

This leads to a deadlock and stalls the system. One possible fix I can think 
of is:

# The Commons Log activator checks whether the LoggerFactory is initialized, 
i.e. {{LoggerFactory.getILoggerFactory() instanceof LoggerContext}}
# If it is not initialized, the activator starts the initialization in a 
separate thread, which waits until {{LoggerFactory}} is initialized and only 
then starts LogbackManager
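
A rough sketch of the two steps above, using only the JDK so it can be run standalone. Everything here is illustrative and not the actual Commons Log code: {{FakeLoggerContext}} is a hypothetical stand-in for Logback's LoggerContext, the {{Supplier}} stands in for {{LoggerFactory.getILoggerFactory()}}, the {{Runnable}} stands in for starting LogbackManager, and polling with a sleep is just one possible way to wait.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class DeferredStartSketch {
    /** Hypothetical stand-in for Logback's LoggerContext. */
    static class FakeLoggerContext {}

    /**
     * If the factory is already a FakeLoggerContext, runs startManager on the
     * calling thread and returns null. Otherwise returns a daemon thread that
     * polls the factory and runs startManager once it is initialized, so the
     * caller (the activator) is never blocked.
     */
    static Thread startWhenInitialized(Supplier<Object> factory,
                                       Runnable startManager, long pollMillis) {
        if (factory.get() instanceof FakeLoggerContext) {
            startManager.run();
            return null;
        }
        Thread waiter = new Thread(() -> {
            try {
                while (!(factory.get() instanceof FakeLoggerContext)) {
                    Thread.sleep(pollMillis);
                }
                startManager.run();
            } catch (InterruptedException e) {
                // Bundle is being stopped: give up without starting the manager.
                Thread.currentThread().interrupt();
            }
        }, "logback-deferred-start");
        waiter.setDaemon(true);
        waiter.start();
        return waiter;
    }

    /** Simulates a factory that becomes ready only after the activator returned. */
    static boolean demo() throws InterruptedException {
        AtomicReference<Object> factory = new AtomicReference<>("not ready");
        AtomicBoolean managerStarted = new AtomicBoolean(false);
        Thread waiter = startWhenInitialized(
            factory::get, () -> managerStarted.set(true), 10);
        if (waiter == null || managerStarted.get()) {
            return false; // should have been deferred
        }
        factory.set(new FakeLoggerContext()); // initialization completes elsewhere
        waiter.join(2000);
        return managerStarted.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("manager started after deferred wait: " + demo());
    }
}
```

A real implementation would also need to stop the waiting thread in {{BundleActivator.stop}} and make sure the check itself does not take the classloader lock in a way that reintroduces the problem.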

Thoughts?

[1] http://bugzilla.slf4j.org/show_bug.cgi?id=106
[2] https://github.com/qos-ch/slf4j/blob/master/slf4j-api/src/main/java/org/slf4j/LoggerFactory.java#L292

> LogbackManager causes lockup on second startup
> ----------------------------------------------
>
>                 Key: SLING-3189
>                 URL: https://issues.apache.org/jira/browse/SLING-3189
>             Project: Sling
>          Issue Type: Bug
>          Components: Commons
>    Affects Versions: Commons Log 3.0.2
>            Reporter: Bertrand Delacretaz
>         Attachments: SLING-3189.stack.txt
>
>
> Looks like I didn't test yesterday's SLING-3185 fix thoroughly enough, I'm 
> seeing a system lockup on second startup of the current trunk launchpad.
> Steps to reproduce:
> 1. Start launchpad/builder runnable jar with no existing sling folder, Sling 
> starts fine.
> 2. Stop and restart.
> 3. Sling doesn't start, the BundleWiringImpl waits forever on a classloading 
> call triggered by StaticLoggerBinder.getSingleton()
> Stack trace:
> Daemon Thread [FelixStartLevel] (Suspended)
>       owns: BundleWiringImpl$BundleClassLoaderJava5  (id=40)
>       waiting for: HashMap<K,V>  (id=41)
>       Object.wait(long) line: not available [native method]
>       HashMap<K,V>(Object).wait() line: 485
>       BundleWiringImpl$BundleClassLoaderJava5(BundleWiringImpl$BundleClassLoader).findClass(String) line: 2050
>       BundleWiringImpl.findClassOrResourceByDelegation(String, boolean) line: 1472
>       BundleWiringImpl.access$400(BundleWiringImpl, String, boolean) line: 75
>       BundleWiringImpl$BundleClassLoaderJava5(BundleWiringImpl$BundleClassLoader).loadClass(String, boolean) line: 1923
>       BundleWiringImpl$BundleClassLoaderJava5(ClassLoader).loadClass(String) line: 247
>       LogbackManager.ensureSlf4jIsInitialized() line: 147
>       LogbackManager.<init>(BundleContext) line: 103
>       Activator.start(BundleContext) line: 55



--
This message was sent by Atlassian JIRA
(v6.1#6144)
