[ 
https://issues.apache.org/jira/browse/DOSGI-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amichai Rothman updated DOSGI-163:
----------------------------------

    Attachment: fix_importregistrationimpl_2.diff
                fix_importregistrationimpl_1.diff

Ok, here is an updated patch which should address both deadlocks. For the 
reviewer's convenience, it is split into two patch files - the first is a bunch 
of cleanup, refactoring and documenting which should make the class more clear, 
concise and straightforward (and hopefully less error-prone), and the second is 
the actual synchronization fix on top of the first.

The first patch doesn't include any functional changes (except for log texts), 
but only makes future changes easier and clearer (hopefully). It consists of 
the following changes:

- deleted ImportReferenceImpl, which was a simple wrapper delegating everything 
back to ImportRegistrationImpl, and made the latter implement ImportReference 
directly instead (which it basically already did)
- added clarifying comments and javadocs
- renamed init() to initParent(), childs to children, and detatched to detached 
for clarity and correctness
- reordered methods to be better grouped by their relationships
- unified checks for isFailure and closed into a single isInvalid() check
- removed redundant check of closed before calling close() (which already 
checks that)
- replaced cumbersome methods with simple returned conditionals
- removed unused methods
- removed accessors which were used only in one place (internally)
- made children use parent's importedService reference, instead of always 
setting the same reference on all children

The second patch is the one that addresses the deadlock issue at hand, by 
reworking the synchronized methods into finer synchronized blocks, such that 
they guard only the internal state of the class from inconsistencies, but make 
sure that no locks are held when invoking external methods (most of which have 
their own locks - the previous double-locking is how the deadlocks came to be 
in the first place).

Comments, suggestions and corrections are most welcome (and committing the 
patches to trunk too, if all is well! ;-) )

                
> Deadlock when shutting down
> ---------------------------
>
>                 Key: DOSGI-163
>                 URL: https://issues.apache.org/jira/browse/DOSGI-163
>             Project: CXF Distributed OSGi
>          Issue Type: Bug
>          Components: DSW
>    Affects Versions: 1.4.0
>         Environment: Oracle JDK 1.7.0_17, Karaf 2.3.1, DOSGi 1.4.0
>            Reporter: Amichai Rothman
>         Attachments: fix_dosgi_unregister_deadlock.diff, 
> fix_importregistrationimpl_1.diff, fix_importregistrationimpl_2.diff
>
>
> Karaf sometimes hangs while shutting down, and a thread dump shows a deadlock 
> in DOSGi. In general, the OSGi specs recommend not to hold any locks when 
> calling the APIs since it can result in a deadlock, as is the case here.
> A proposed fix calls ServiceRegistration.unregister outside of the 
> synchronized block in ImportRegistrationImpl.close (note that it still is 
> synchronized when called from closeAll, which may pose a related but separate 
> problem - but the fix does fix this specific deadlock as it was encountered 
> here).
> Found one Java-level deadlock:
> =============================
> "pool-18-thread-3":
>   waiting to lock monitor 0x00007f0658003f40 (object 0x00000000e5b485b0, a 
> org.eclipse.osgi.internal.serviceregistry.ServiceUse),
>   which is held by "Blueprint Extender: 2"
> "Blueprint Extender: 2":
>   waiting to lock monitor 0x00007f0654002a50 (object 0x00000000f1b70240, a 
> org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl),
>   which is held by "pool-18-thread-3"
> Java stack information for the threads listed above:
> ===================================================
> "pool-18-thread-3":
>         at 
> org.eclipse.osgi.internal.serviceregistry.ServiceRegistrationImpl.releaseService(ServiceRegistrationImpl.java:562)
>         - waiting to lock <0x00000000e5b485b0> (a 
> org.eclipse.osgi.internal.serviceregistry.ServiceUse)
>         at 
> org.eclipse.osgi.internal.serviceregistry.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:245)
>         at 
> org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl.instanceClosed(ImportRegistrationImpl.java:118)
>         - locked <0x00000000f1b70240> (a 
> org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl)
>         at 
> org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl.close(ImportRegistrationImpl.java:100)
>         - locked <0x00000000f1b70240> (a 
> org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl)
>         at 
> org.apache.cxf.dosgi.topologymanager.importer.TopologyManagerImport.unexportNotAvailableServices(TopologyManagerImport.java:227)
>         at 
> org.apache.cxf.dosgi.topologymanager.importer.TopologyManagerImport.access$200(TopologyManagerImport.java:53)
>         at 
> org.apache.cxf.dosgi.topologymanager.importer.TopologyManagerImport$2.run(TopologyManagerImport.java:201)
>         - locked <0x00000000ea71b6c0> (a java.util.HashMap)
>         - locked <0x00000000ea71b6f8> (a java.util.HashMap)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:722)
> "Blueprint Extender: 2":
>         at 
> org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl.closeAll(ImportRegistrationImpl.java:127)
>         - waiting to lock <0x00000000f1b70240> (a 
> org.apache.cxf.dosgi.dsw.service.ImportRegistrationImpl)
>         at 
> org.apache.cxf.dosgi.dsw.service.ClientServiceFactory.remove(ClientServiceFactory.java:110)
>         at 
> org.apache.cxf.dosgi.dsw.service.ClientServiceFactory.ungetService(ClientServiceFactory.java:104)
>         - locked <0x00000000f1b6f7e0> (a 
> org.apache.cxf.dosgi.dsw.service.ClientServiceFactory)
>         at 
> org.eclipse.osgi.internal.serviceregistry.ServiceUse$2.run(ServiceUse.java:238)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at 
> org.eclipse.osgi.internal.serviceregistry.ServiceUse.ungetService(ServiceUse.java:236)
>         at 
> org.eclipse.osgi.internal.serviceregistry.ServiceRegistrationImpl.ungetService(ServiceRegistrationImpl.java:518)
>         - locked <0x00000000e5b485b0> (a 
> org.eclipse.osgi.internal.serviceregistry.ServiceUse)
>         at 
> org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.ungetService(ServiceRegistry.java:510)
>         at 
> org.eclipse.osgi.framework.internal.core.BundleContextImpl.ungetService(BundleContextImpl.java:636)
>         at 
> org.apache.aries.blueprint.container.ReferenceListRecipe$ServiceDispatcher.destroy(ReferenceListRecipe.java:197)
>         - locked <0x00000000e5bab728> (a 
> org.apache.aries.blueprint.container.ReferenceListRecipe$ServiceDispatcher)
>         at 
> org.apache.aries.blueprint.container.ReferenceListRecipe.untrack(ReferenceListRecipe.java:147)
>         - locked <0x00000000eaef82d0> (a java.lang.Object)
>         at 
> org.apache.aries.blueprint.container.AbstractServiceReferenceRecipe.serviceRemoved(AbstractServiceReferenceRecipe.java:370)
>         at 
> org.apache.aries.blueprint.container.AbstractServiceReferenceRecipe.access$200(AbstractServiceReferenceRecipe.java:72)
>         at 
> org.apache.aries.blueprint.container.AbstractServiceReferenceRecipe$3.run(AbstractServiceReferenceRecipe.java:324)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         at 
> org.apache.aries.blueprint.container.ExecutorServiceWrapper.run(ExecutorServiceWrapper.java:106)
>         at 
> org.apache.aries.blueprint.utils.threading.impl.DiscardableRunnable.run(DiscardableRunnable.java:48)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:722)
> Found 1 deadlock.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to