On Jan 30, 2008 11:31 AM, Stuart McCulloch <[EMAIL PROTECTED]> wrote:
> On 30/01/2008, Felix Meschberger <[EMAIL PROTECTED]> wrote:
> >
> > Hi Niclas,
> >
> > The problem is (a) the generous synchronisation of Log4J and (b) the
> > locking used by the class loading code. In our projects we regularly
> > face issues between Log4J and our ClassLoader implementations
> > synchronizing on ClassLoader.loadClass().
> >
> > The deadlock occurs because both parties - framework and Log4J - lock
> > "big" parts of their code and call code outside of their scope while
> > holding the lock: the framework calls the LogService outside of the
> > framework and Log4J calls into class loading outside of Log4J.
> >
> > One solution I could imagine is not using Log4J, which may or may not
> > be an option. Maybe SLF4J or Logback could be an option here? [ In
> > Sling we actually use Logback as a logging backend for our LogService
> > implementation ]
> >
> > Another solution would be to enhance the framework Logger to disable
> > the use of a LogService, e.g. by defining a framework property which,
> > when set, causes the Logger to never use the LogService.
> >
> > Both solutions don't sound right ...
>
> other possible solutions:
>
> a) have a separate thread make the LogService call (fed from a queue)
>    although you'd have to be careful not to introduce other deadlocks
So would it be acceptable to deliver log calls asynchronously? If so, I
can probably make that change quickly ...

regards,

Karl

> b) delay sending log messages from critical sections of the framework
>    ie. log to a buffer, then send the messages when it's safe to do so
>
> > Regards
> > Felix
> >
> > On Wednesday, 30.01.2008, at 13:51 +0800, Niclas Hedhman wrote:
> > > On Tuesday 29 January 2008 16:55, Karl Pauls wrote:
> > > > Could you have them retry using Felix 1.0.3? This might be related to
> > > > some of the concurrency things we fixed.
> > > >
> > > > In case they cannot be bothered to retry on Felix 1.0.3, then maybe
> > > > they can provide a minimal config file that only uses publicly
> > > > available bundles and has this issue (then I can look into it).
> > >
> > > I am looking at the code in the trunk, and it appears that the locking
> > > that triggers this is in place.
> > >
> > > As always, it is a bit tricky to set up threading problems. So, I would
> > > like to run a "head exercise" first.
> > >
> > > 1. The Starter thread "FelixStartLevel" locks the ModuleFactoryImpl
> > > instance in R4SearchPolicyCore.resolve().
> > >
> > > 2. The Configuration Admin thread "Configuration Updater" calls the
> > > LogService with the Configuration instance, which leads Log4J to lock
> > > its own RootLogger and call ClassLoader.loadClass() on something found
> > > in the configuration. This leads to trying to acquire the
> > > ModuleFactoryImpl lock, either in R4SearchPolicyCore.resolve() or, as
> > > in the provided stack trace, in
> > > R4SearchPolicyCore.getInUseCandidates() due to a ClassNotFound in the
> > > previous step.
> > >
> > > 3. The "FelixStartLevel" thread reaches
> > >     m_logger.log(Logger.LOG_DEBUG, "WIRE: " + wires[wireIdx]);
> > > in R4SearchPolicyCore.createWires(), and the log() method will try to
> > > acquire the RootLogger lock.
> > >
> > > 4. DEADLOCK.
> > >
> > >
> > > I agree this is very special to the LogService, since Felix binds to
> > > it and uses it for its internal use, and the responsibility is across
> > > two different systems. Suggestions are welcome.
> > >
> > >
> > > Cheers
> > >
> >
> --
> Cheers, Stuart
>

--
Karl Pauls
[EMAIL PROTECTED]
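
For reference, option (a) from the thread (a separate thread making the LogService call, fed from a queue) could look roughly like the sketch below. This is only an illustration under assumptions, not the Felix implementation: LogSink is a hypothetical stand-in for whatever actually forwards to the LogService, and AsyncLogDispatcher is an invented name. The point is that the calling thread only enqueues, so it never waits on a Log4J lock while holding a framework lock.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical stand-in for the code that forwards to the LogService. */
interface LogSink {
    void log(int level, String message);
}

/** Queues log entries and delivers them from a dedicated thread. */
final class AsyncLogDispatcher {
    private static final class Entry {
        final int level;
        final String message;

        Entry(int level, String message) {
            this.level = level;
            this.message = message;
        }
    }

    // Sentinel entry that tells the worker to stop after draining the queue.
    private static final Entry STOP = new Entry(-1, null);

    private final BlockingQueue<Entry> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    AsyncLogDispatcher(LogSink sink) {
        worker = new Thread(() -> {
            try {
                for (Entry e = queue.take(); e != STOP; e = queue.take()) {
                    try {
                        // Runs on the worker thread, which holds no framework locks.
                        sink.log(e.level, e.message);
                    } catch (RuntimeException ex) {
                        // A failing sink must not kill the delivery thread.
                    }
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        }, "AsyncLogDispatcher");
        worker.setDaemon(true);
        worker.start();
    }

    /** Safe to call while holding framework locks: only enqueues, never calls the sink. */
    void log(int level, String message) {
        queue.add(new Entry(level, message));
    }

    /** Delivers everything queued so far, then stops the worker. */
    void shutdown() throws InterruptedException {
        queue.add(STOP);
        worker.join();
    }
}
```

Two caveats with this sketch: the queue is unbounded, so a stalled sink lets memory grow, and messages are delivered later than they were logged, so the message should be rendered to a String before it is queued (as in the "WIRE: " + wires[wireIdx] call) rather than holding on to framework objects.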
