[ https://issues.apache.org/jira/browse/FELIX-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17743821#comment-17743821 ]
Tom Watson commented on FELIX-6616:
-----------------------------------

My particular case uses a dynamic, greedy reference with 1..1 cardinality. The issue only happens when activation is running on one thread and that thread needs to bind one service before it can call the activate method.

The activation logic finds an existing service X with ranking=1. During this process it activates a tracker for service X, and at that moment the tracker may receive a service event on another thread for the registration of service X ranking=100. That second thread then finds the component instance (which is still in the middle of being activated) and binds service X ranking=100 to it. This is where the bad ordering can happen.

The underlying problem is that binding during activation goes through a slightly different code path than the code that reacts to service events dynamically. The activation path does not know that another thread is reacting to a new service registration, so it binds the lower-ranked service even though X ranking=100 has already been bound.

I have a fix that greatly reduces the window for the failing case, but it is not perfect. I hesitate to attempt a more thorough fix because I see no way to do it without blocking. I will try to get my suggested "fix" up soon, after I verify that it works against the tests we have in Open Liberty. But I caution that it will not be a perfect fix.

> Dynamic greedy 1..1 references may activate with no reference service bound
> ---------------------------------------------------------------------------
>
>                 Key: FELIX-6616
>                 URL: https://issues.apache.org/jira/browse/FELIX-6616
>             Project: Felix
>          Issue Type: Bug
>          Components: Declarative Services (SCR)
>    Affects Versions: scr-2.2.6
>            Reporter: Tom Watson
>            Assignee: Tom Watson
>            Priority: Major
>             Fix For: scr-2.2.8
>
>
> When a dynamic greedy reference has 1..1 cardinality, a timing issue is
> possible that causes SCR to unbind all references while activating the
> component.
> The timing window involves at least two threads:
> 1) thread 1 is in the process of activating the component with a 1..1
> dynamic greedy reference to service X (ranking=1)
> 2) thread 2 is in the process of registering another service X with a
> higher service ranking=100
> When this happens, thread 1 determines that it should bind service X
> ranking=1. Thread 1 creates the service component instance and enables the
> tracking of all the dependencies. It then proceeds to bind all the required
> services. Before thread 1 binds service X ranking=1, thread 2 registers
> service X ranking=100. Thread 2 then finds the service component instance
> that thread 1 has created (but for which thread 1 is still binding
> services) and binds service X ranking=100. At this point thread 2 thinks
> the component actually got bound to service X ranking=1, so it begins to
> unbind it. If thread 1 proceeds before that unbind happens, it binds
> service X ranking=1, so the component itself (depending on the
> implementation) starts using X ranking=1. Then thread 2 proceeds and
> unbinds service X ranking=1.
> So basically we are left with this flow:
> 1) bind X ranking=100
> 2) bind X ranking=1
> 3) unbind X ranking=1
> At this point the component will be confused and will likely think it has
> no services to use.

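To make the three-step flow above concrete, here is a minimal single-threaded sketch in plain Java. It is not SCR code and does not reproduce the race itself; it only replays the resulting call order against a component. The names BindOrderSketch, ServiceX, Component and replayBadOrder are hypothetical and exist only for this illustration. The component keeps its one reference in a field and, as many implementations do, clears the field on unbind only when it still points at the service being unbound.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class BindOrderSketch {

    /** Hypothetical stand-in for a registered service X with a service ranking. */
    static final class ServiceX {
        final int ranking;

        ServiceX(int ranking) {
            this.ranking = ranking;
        }

        @Override
        public String toString() {
            return "X(ranking=" + ranking + ")";
        }
    }

    /** Hypothetical component that holds a single dynamic 1..1 reference. */
    static final class Component {
        private final AtomicReference<ServiceX> current = new AtomicReference<>();
        final List<String> log = new ArrayList<>();

        void bind(ServiceX s) {
            // The most recently bound service replaces whatever was held before.
            current.set(s);
            log.add("bind " + s);
        }

        void unbind(ServiceX s) {
            // Clear the field only if it still points at the service being unbound.
            current.compareAndSet(s, null);
            log.add("unbind " + s);
        }

        ServiceX current() {
            return current.get();
        }
    }

    /** Replays the three calls from the issue description in the problematic order. */
    static Component replayBadOrder() {
        ServiceX low = new ServiceX(1);
        ServiceX high = new ServiceX(100);
        Component c = new Component();
        c.bind(high);  // 1) thread 2 reacts to the new registration
        c.bind(low);   // 2) thread 1 continues its activation-time binding
        c.unbind(low); // 3) thread 2 unbinds what it believes was bound first
        return c;
    }

    public static void main(String[] args) {
        Component c = replayBadOrder();
        System.out.println(c.log);
        System.out.println("current = " + c.current());
    }
}
```

With this ordering the component ends up holding no service at all, even though service X ranking=100 is still registered, which matches the symptom in the issue title.
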
-- This message was sent by Atlassian Jira (v8.20.10#820010)