Ryan Moquin created FELIX-6190:
----------------------------------

             Summary: Declarative services component implementing 
EventHookListener deadlocks SCR.
                 Key: FELIX-6190
                 URL: https://issues.apache.org/jira/browse/FELIX-6190
             Project: Felix
          Issue Type: Bug
          Components: Declarative Services (SCR)
    Affects Versions: scr-2.1.16
            Reporter: Ryan Moquin


When a declarative services component that implements EventListenerHook is 
loaded by SCR, a deadlock occurs.  It occurs because SCR attempts to get the 
service so that it can deliver event notifications to it while the service is 
still in the process of being loaded.  Below is a breakdown of the deadlock 
stack trace we ran into.  I spent some time identifying the services being 
interacted with at the various stages in the thread stack traces to come to 
this conclusion.  After some thinking, it seems like the fix would be to check 
whether an EventListenerHook that needs to be loaded matches the service that 
is currently being loaded.  I THINK that would prevent this deadlock from 
occurring.  The problem can be worked around, but it is confusing when it 
occurs.  Scott Lewis (who runs the ECF project) said it was intermittent for 
him.  I ran into it with Equinox first, then switched to Felix and ran into it 
every time I ran the project using an exported bndtools jar with ECF.  Scott 
initially logged this against Equinox and there was some discussion there.  
I'm attaching that issue to this one in case it is useful.
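
As a rough illustration of the check suggested above, here is a minimal 
sketch in plain Java.  It is not SCR or framework code: HookDispatchSketch, 
its methods and the string names are all hypothetical, and it only models the 
idea of skipping a hook that is itself still being loaded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model only; none of these names exist in Felix SCR or OSGi.
final class HookDispatchSketch {
    // Components whose loading has started but not yet finished.
    private final Set<String> loading = ConcurrentHashMap.newKeySet();
    // Registered event listener hooks, identified by name for simplicity.
    private final List<String> hooks = new ArrayList<>();

    void registerHook(String name) {
        hooks.add(name);
    }

    // Loads a component. The registration event is dispatched while the
    // component is still marked as loading, as in the scenario above.
    List<String> load(String name) {
        loading.add(name);
        try {
            return hooksToNotify();
        } finally {
            loading.remove(name);
        }
    }

    // Returns the hooks to notify, skipping any hook that is still loading
    // so that dispatch never waits on a half-loaded component.
    List<String> hooksToNotify() {
        List<String> result = new ArrayList<>();
        for (String hook : hooks) {
            if (!loading.contains(hook)) {
                result.add(hook);
            }
        }
        return result;
    }
}
```

In this model, load("TopologyManager") notifies every registered hook except 
TopologyManager itself.  Whether skipping the notification is acceptable for 
a real hook is a separate question that this sketch does not answer.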

In the breakdown and stack traces below, the TopologyManager class (from the 
ECF project) is being loaded by SCR.  That class implements the 
EventListenerHook interface:

 

Main thread:
SCR tries to register the TopologyManager
Service event type 1 (REGISTERED) is fired
Equinox/Felix iterates the event listener hooks, of which the TopologyManager 
is one, so it tries to get the TopologyManager service (to do the notification).
An attempt is made to retrieve the service count service to update the change 
count
The ComponentRegistry updateChangeCount method is called
The thread blocks on the changeCountLock monitor

 

Timer Thread 0:
ComponentRegistry locks the changeCountLock
The SCR service properties are modified (service.changecount)
Service event type 2 (MODIFIED) is fired
Equinox/Felix tries to retrieve the TopologyManager, because it is an 
EventListenerHook that must be notified of the event
The thread then waits on the servicecount latch for the static class in 
ServiceHolder
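
The two thread breakdowns above amount to a lock-order inversion.  Here is a 
minimal sketch of that pattern in plain Java.  It does not use SCR or OSGi: 
the ReentrantLock stands in for the changeCountLock monitor, the 
CountDownLatch stands in for the latch that is released once the component 
has finished loading, and the timeout and interrupt exist only so that the 
demo terminates instead of hanging.  All names are assumptions made for 
illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the deadlock; not taken from the SCR source.
final class ScrDeadlockSketch {

    // Returns true if both threads ended up waiting on each other.
    static boolean reproduce(long patienceMillis) throws InterruptedException {
        // Stand-in for the changeCountLock monitor in ComponentRegistry.
        ReentrantLock changeCountLock = new ReentrantLock();
        // Stand-in for the latch released when the component is loaded.
        CountDownLatch componentLoaded = new CountDownLatch(1);
        // Demo plumbing: the main thread proceeds once the timer holds the lock.
        CountDownLatch timerHoldsLock = new CountDownLatch(1);
        boolean[] timerStuck = {false};

        Thread timer = new Thread(() -> {
            changeCountLock.lock(); // change count update takes the lock
            try {
                timerHoldsLock.countDown();
                // The MODIFIED event needs the hook, which is still loading.
                componentLoaded.await();
            } catch (InterruptedException e) {
                timerStuck[0] = true; // still waiting when the demo gave up
            } finally {
                changeCountLock.unlock();
            }
        }, "Timer Thread 0");
        timer.start();

        timerHoldsLock.await();
        // Main thread: loading the component requires the same lock.
        boolean gotLock = changeCountLock.tryLock(patienceMillis, TimeUnit.MILLISECONDS);
        if (gotLock) {
            changeCountLock.unlock();
            componentLoaded.countDown(); // loading would finish here
        }
        timer.interrupt(); // break the cycle so the demo terminates
        timer.join();
        return !gotLock && timerStuck[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlock reproduced: " + reproduce(200));
    }
}
```

In the real case neither side has a timeout, so both threads wait forever: 
the main thread for changeCountLock, and the timer thread for the latch that 
only the main thread can release.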

 

The stack trace is from Scott.  I didn't save the stack traces from the 
threads I was investigating, but I can easily get them if the explanation 
above isn't enough to reproduce the problem.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
