[
https://issues.apache.org/jira/browse/SLING-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206306#comment-15206306
]
Carsten Ziegeler commented on SLING-5622:
-----------------------------------------
The activation of a newly added handler is now split into separate blocks.
Each block is synchronized by itself, but anything can happen in between.
When a handler is added, the previously active handler should be deactivated.
Because this happens in the second block, there is a small chance that the
previous handler has already been activated again by then. To avoid
deactivating it in that situation, I added the count. However, while
explaining this I realized that the only situation where this could happen is
when the handler being added is removed again while the add method is still
in progress. I think that's not possible, so we can probably remove the logic.
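
For illustration, here is a minimal sketch of the pattern described in this
comment. It is not the actual resourceresolver code: Handler, HandlerTracker
and HandlerDemo are hypothetical names, and the per-handler activation count
is only an assumption about what "the count" looks like. State changes happen
inside short synchronized blocks, callbacks run without holding the lock, and
the second block skips the deactivation if the count shows that the previous
handler was activated again in the meantime. The demo simulates the
interleaving on a single thread by removing the new handler from within its
own activate() callback.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical handler contract; not the actual Sling interface. */
interface Handler {
    void activate();
    void deactivate();
}

/** Sketch: state changes happen under the lock, callbacks run outside it. */
final class HandlerTracker {
    /** A registered handler plus the number of times it became active. */
    private static final class Entry {
        final Handler handler;
        int activations; // guarded by lock
        Entry(Handler handler) { this.handler = handler; }
    }

    private final Object lock = new Object();
    // Newest handler first; the first entry is the active one. Guarded by lock.
    private final Deque<Entry> entries = new ArrayDeque<>();

    void add(Handler handler) {
        final Entry added = new Entry(handler);
        final Entry previous;
        final int previousCount;
        // Block 1: make the new handler the active one.
        synchronized (lock) {
            previous = entries.peekFirst();
            previousCount = previous == null ? 0 : previous.activations;
            entries.addFirst(added);
            added.activations++;
        }
        handler.activate(); // callback without holding the lock

        if (previous == null) {
            return;
        }
        // Block 2: deactivate the previous handler unless it was activated
        // again in the meantime, which a changed count reveals.
        final boolean stillInactive;
        synchronized (lock) {
            stillInactive = previous.activations == previousCount;
        }
        if (stillInactive) {
            previous.handler.deactivate();
        }
    }

    void remove(Handler handler) {
        final boolean wasActive;
        Entry next = null;
        synchronized (lock) {
            final Entry first = entries.peekFirst();
            wasActive = first != null && first.handler == handler;
            entries.removeIf(e -> e.handler == handler);
            if (wasActive) {
                // The next handler in line becomes active again.
                next = entries.peekFirst();
                if (next != null) {
                    next.activations++;
                }
            }
        }
        if (wasActive) {
            handler.deactivate();
            if (next != null) {
                next.handler.activate();
            }
        }
    }
}

/** Simulates "removed again while add is still in progress" on one thread. */
final class HandlerDemo {
    static List<String> run() {
        final List<String> log = new ArrayList<>();
        final HandlerTracker tracker = new HandlerTracker();
        final Handler a = new Handler() {
            public void activate() { log.add("A+"); }
            public void deactivate() { log.add("A-"); }
        };
        final Handler b = new Handler() {
            // Removing itself during activation stands in for a concurrent remove.
            public void activate() { log.add("B+"); tracker.remove(this); }
            public void deactivate() { log.add("B-"); }
        };
        tracker.add(a);
        tracker.add(b);
        return log;
    }
}
```

In this sketch HandlerDemo.run() never logs "A-": when add(b) reaches its
second block, the count of A has changed, so A stays active. If the removal
of a handler during its own add really cannot happen, the count check is dead
code and the second block could deactivate unconditionally.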
> Deadlock between service unregister (thus unbindTopologyEventListener) and
> discoveryLiteCheck
> ---------------------------------------------------------------------------------------------
>
> Key: SLING-5622
> URL: https://issues.apache.org/jira/browse/SLING-5622
> Project: Sling
> Issue Type: Bug
> Components: Extensions
> Affects Versions: Resource Resolver 1.4.8, Discovery Oak 1.2.6
> Environment: discovery.oak 1.2.6
> resourceresolver 1.4.8
> Reporter: Stefan Egli
> Assignee: Carsten Ziegeler
> Priority: Critical
> Fix For: Resource Resolver 1.4.10
>
>
> There's a Java-level deadlock between the following two threads; excerpts
> are below. It's not yet clear whether this is an issue in discovery.oak or
> resourceresolver.
> thread 1:
> {noformat}
> "LeaseFailureHandler-Thread" daemon prio=5 tid=0x7f34 nid=0xffffffff in
> Object.wait()
> java.lang.Thread.State: WAITING (on object monitor)
> at sun.misc.Unsafe.park(Native Method)
> - waiting to lock <0x575a7644> (a
> java.util.concurrent.locks.ReentrantLock$NonfairSync) owned by
> "discovery.connectors.common.runner.e8dd34be-8886-4e5d-891f-5509b4dea0f0.discoveryLiteCheck"
> tid=0x4,330
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
> at
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
> at
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
> at
> org.apache.sling.discovery.oak.OakDiscoveryService.unbindTopologyEventListener(OakDiscoveryService.java:368)
> ...
> at
> org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:144)
> at
> org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator.unregisterFactory(ResourceResolverFactoryActivator.java:611)
> at
> org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator.unregisterFactory(ResourceResolverFactoryActivator.java:602)
> at
> org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator.checkFactoryPreconditions(ResourceResolverFactoryActivator.java:674)
> at
> org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator.access$100(ResourceResolverFactoryActivator.java:79)
> at
> org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator$1.providerRemoved(ResourceResolverFactoryActivator.java:500)
> at
> org.apache.sling.resourceresolver.impl.providers.ResourceProviderTracker.unregister(ResourceProviderTracker.java:224)
> - locked <0x35cd7613> (a java.util.HashMap)
> at
> org.apache.sling.resourceresolver.impl.providers.ResourceProviderTracker.access$100(ResourceProviderTracker.java:58)
> at
> org.apache.sling.resourceresolver.impl.providers.ResourceProviderTracker$1.removedService(ResourceProviderTracker.java:109)
> ...
> at
> org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:144)
> at
> org.apache.sling.jcr.base.AbstractSlingRepositoryManager.unregisterService(AbstractSlingRepositoryManager.java:262)
> at
> org.apache.sling.jcr.base.AbstractSlingRepositoryManager.stop(AbstractSlingRepositoryManager.java:389)
> ...
> at org.apache.felix.framework.BundleImpl.stop(BundleImpl.java:1038)
> at org.apache.felix.framework.BundleImpl.stop(BundleImpl.java:1024)
> at
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService$1.handleLeaseFailure(DocumentNodeStoreService.java:413)
> at
> org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo$1.run(ClusterNodeInfo.java:696)
> at java.lang.Thread.run(Thread.java:745)
> Locked ownable synchronizers:
> - locked <0x4402f4fd> (a
> java.util.concurrent.locks.ReentrantLock$FairSync)
> - locked <0x1e2230ed> (a
> java.util.concurrent.locks.ReentrantLock$FairSync)
> - locked <0x56ba270f> (a
> java.util.concurrent.locks.ReentrantLock$FairSync)
> {noformat}
> and thread 2:
> {noformat}
> "discovery.connectors.common.runner.e8dd34be-8886-4e5d-891f-5509b4dea0f0.discoveryLiteCheck"
> daemon prio=5 tid=0x10ea nid=0xffffffff waiting for monitor entry
> java.lang.Thread.State: BLOCKED
> at
> org.apache.sling.resourceresolver.impl.providers.ResourceProviderTracker.getResourceProviderStorage(ResourceProviderTracker.java:364)
> - waiting to lock <0x35cd7613> (a java.util.HashMap) owned by
> "LeaseFailureHandler-Thread" tid=0x32,564
> at
> org.apache.sling.resourceresolver.impl.ResourceResolverImpl.createControl(ResourceResolverImpl.java:154)
> at
> org.apache.sling.resourceresolver.impl.ResourceResolverImpl.<init>(ResourceResolverImpl.java:116)
> at
> org.apache.sling.resourceresolver.impl.ResourceResolverImpl.<init>(ResourceResolverImpl.java:110)
> at
> org.apache.sling.resourceresolver.impl.CommonResourceResolverFactoryImpl.getResourceResolverInternal(CommonResourceResolverFactoryImpl.java:257)
> at
> org.apache.sling.resourceresolver.impl.CommonResourceResolverFactoryImpl.getAdministrativeResourceResolver(CommonResourceResolverFactoryImpl.java:140)
> at
> org.apache.sling.resourceresolver.impl.ResourceResolverFactoryImpl.getAdministrativeResourceResolver(ResourceResolverFactoryImpl.java:107)
> at
> org.apache.sling.discovery.oak.cluster.OakClusterViewService.getResourceResolver(OakClusterViewService.java:103)
> at
> org.apache.sling.discovery.oak.cluster.OakClusterViewService.getLocalClusterView(OakClusterViewService.java:110)
> at
> org.apache.sling.discovery.base.commons.BaseDiscoveryService.getTopology(BaseDiscoveryService.java:77)
> at
> org.apache.sling.discovery.oak.OakDiscoveryService.checkForTopologyChange(OakDiscoveryService.java:657)
> at
> org.apache.sling.discovery.oak.pinger.OakViewChecker.discoveryLiteCheck(OakViewChecker.java:232)
> - locked <0x740a9729> (a java.lang.Object)
> at
> org.apache.sling.discovery.oak.pinger.OakViewChecker.access$000(OakViewChecker.java:64)
> at
> org.apache.sling.discovery.oak.pinger.OakViewChecker$1.run(OakViewChecker.java:208)
> at
> org.apache.sling.discovery.base.commons.PeriodicBackgroundJob.safelyRun(PeriodicBackgroundJob.java:86)
> at
> org.apache.sling.discovery.base.commons.PeriodicBackgroundJob.run(PeriodicBackgroundJob.java:77)
> at java.lang.Thread.run(Thread.java:745)
> Locked ownable synchronizers:
> - locked <0x575a7644> (a
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)