[ https://issues.apache.org/jira/browse/FELIX-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
metatech updated FELIX-3067:
----------------------------
Attachment: felix_unblock_deadlock.patch
The attached patch, felix_unblock_deadlock.patch, detects the deadlock and
throws an exception. The calling thread then gives up on obtaining the Felix
global lock and releases the Java lock it holds (for instance a Blueprint
lock). This gives the other thread (for instance "FelixPackageAdmin") a chance
to acquire both the Felix global lock AND the Java lock.
The calling thread will most probably end up in a "failed" or "error" state,
and its deployment will need to be restarted, but this is better than freezing
the Felix container completely, which also blocks Karaf console commands.
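
A minimal sketch of the give-up-and-release idea, not the patch itself. It
assumes a plain timeout stands in for the patch's deadlock detection, and all
names (GiveUpOnDeadlock, GLOBAL_LOCK, APP_LOCK, acquireGlobalLockOrThrow) are
hypothetical stand-ins rather than Felix code:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class GiveUpOnDeadlock {
    // Hypothetical stand-in for the framework-wide global lock.
    static final ReentrantLock GLOBAL_LOCK = new ReentrantLock();
    // Hypothetical stand-in for an application-level Java lock (e.g. a Blueprint lock).
    static final Object APP_LOCK = new Object();

    // Assumption: a timeout approximates "deadlock detected".
    static void acquireGlobalLockOrThrow(long timeoutMillis) throws InterruptedException {
        if (!GLOBAL_LOCK.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException("global lock unavailable, giving up");
        }
    }

    /** Returns {callingThreadSucceeded, otherThreadFinished}. */
    static boolean[] runScenario() throws InterruptedException {
        CountDownLatch globalHeld = new CountDownLatch(1);
        CountDownLatch appHeld = new CountDownLatch(1);
        boolean[] result = new boolean[2];

        // Other thread: holds the global lock, then needs the application lock.
        Thread other = new Thread(() -> {
            GLOBAL_LOCK.lock();
            try {
                globalHeld.countDown();
                appHeld.await();
                synchronized (APP_LOCK) {
                    result[1] = true;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                GLOBAL_LOCK.unlock();
            }
        });
        other.start();
        globalHeld.await();

        // Calling thread: holds the application lock, then needs the global lock.
        try {
            synchronized (APP_LOCK) {
                appHeld.countDown();
                acquireGlobalLockOrThrow(200);
                try {
                    result[0] = true;
                } finally {
                    GLOBAL_LOCK.unlock();
                }
            }
        } catch (IllegalStateException e) {
            // The exception unwound the synchronized block, so APP_LOCK is free again.
            result[0] = false;
        }
        other.join(5000);
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = runScenario();
        System.out.println("calling thread succeeded: " + r[0]);
        System.out.println("other thread finished: " + r[1]);
    }
}
```

In this sketch the calling thread fails, but the other thread completes
instead of both threads blocking forever.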
> Prevent Deadlock Situation in Felix.acquireGlobalLock
> -----------------------------------------------------
>
> Key: FELIX-3067
> URL: https://issues.apache.org/jira/browse/FELIX-3067
> Project: Felix
> Issue Type: Improvement
> Components: Framework
> Affects Versions: framework-3.0.7, framework-3.0.8, framework-3.0.9,
> framework-3.2.0, framework-3.2.1, fileinstall-3.1.10
> Reporter: Felix Meschberger
> Attachments: FELIX-3067-sling.patch, FELIX-3067.patch,
> felix_unblock_deadlock.patch, threaddump-ise-deadlock.txt,
> threads_locked_by_camel_type_converter
>
>
> Every now and then we encounter deadlock situations which involve the
> Felix.acquireGlobalLock method. In our use case we have the following aspects
> which contribute to this:
> (a) The Apache Felix Declarative Services implementation stops components
> (and thus causes service unregistration) while the bundle lock is held,
> because this happens in a SynchronousBundleListener handling the STOPPING
> bundle event. We have to do this so that the bundle's components are stopped
> properly before the bundle itself is really stopped.
> (b) A special class loader implementation which dynamically resolves
> packages, which in turn uses the global lock
> (c) The Eclipse Gemini Blueprint implementation, which operates asynchronously
> (d) Synchronization in application classes
> I would assume that we can often self-heal such complex deadlock
> situations if we let acquireGlobalLock time out. Looking at the callers of
> acquireGlobalLock, there already seems to be provision to handle this case,
> since acquireGlobalLock returns true only if the global lock has actually
> been acquired.
> This issue is kind of a companion to FELIX-3000 where deadlocks involve
> sending service registration events while holding the bundle lock.
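
The timeout suggested in the quoted description could look roughly like the
sketch below. It is an illustration only: TimedGlobalLock, acquire and release
are hypothetical names, and the real Felix.acquireGlobalLock may be structured
differently. The point is that the method returns false when the lock cannot
be obtained in time, so callers that already check the boolean result can back
off.

```java
public class TimedGlobalLock {
    private Thread owner;   // thread currently holding the lock, or null
    private int holdCount;  // supports re-entrant acquisition by the owner

    /** Returns true only if the lock was actually acquired before the timeout. */
    public synchronized boolean acquire(long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        Thread me = Thread.currentThread();
        while (owner != null && owner != me) {
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                return false; // timed out: the caller should give up
            }
            try {
                wait(remainingMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        owner = me;
        holdCount++;
        return true;
    }

    public synchronized void release() {
        if (owner != Thread.currentThread()) {
            throw new IllegalStateException("current thread does not hold the lock");
        }
        if (--holdCount == 0) {
            owner = null;
            notifyAll(); // wake up threads waiting in acquire()
        }
    }
}
```

A caller would then write something like
`if (!lock.acquire(5000)) { /* report failure and return */ }` and call
`release()` in a finally block on the success path.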