Hi,

On 20.04.2012 at 00:19, David Jencks wrote:

> We've run into one definite concurrency problem in SCR and I've been 
> discussing offline with a colleague how to fix it and wanted to get the 
> discussion out in the open.
> 
> The original symptom was when 2 mandatory service refs were satisfied on 
> different threads at once: the 2nd wasn't recognized so the component never 
> got activated.
> 
> This is easily solved by synchronizing but this introduces risk of deadlocks 
> (my first attempt, 
> https://issues.apache.org/jira/secure/attachment/12522537/FELIX-3456-1.diff)

Yes

> 
> We tried some partly asynchronous approaches such as 
> https://issues.apache.org/jira/secure/attachment/12523313/FELIX-3456-4.diff.  
> Unless there's a timeout (presumably due to deadlock) this gets all service 
> events processed before the thread exits from its first call into SCR.  
> However this can result in service events getting processed later than one 
> expects possibly on a different thread.  On further thought we concluded that 
> a service event must be processed fully before the service registration call 
> returns.  We therefore don't think any kind of asynchronous approach will 
> work.

Yes. For activation it might cause SCR to not terminate processing before the 
synchronous bundle event handling ends. More importantly, though, unbinding 
services must be handled synchronously to prevent errors in the components 
caused by SCR calling the unbind methods when the bound service object is 
already invalid.


> 
> We've discovered the anti-circular-dependency clause in the spec (112.3.5) 
> but it appears to be overly biased towards SCR-only graphs of services.  We 
> are leaning towards thinking that SCR also needs to consider:
> 
> - an activate method registers a service that satisfies an optional 
> dependency of a component being activated by scr on the same thread.
> - the same, except the activate method starts a new thread to register the 
> service and waits for it to complete.
> 

You can come up with lots of scenarios here. The common thread is that an event 
may arrive for a component while its state is changing. This is particularly 
problematic during activation and deactivation (due to missing dependencies).

> Another scenario to consider is
> 
> components C1 and C2 registering as services, each with an optional dynamic 
> dependency on the other.  If one starts, and then the other, there is no 
> problem, they both get references to the other.  If they both start at the 
> same time in separate threads (either because they are in different bundles 
> or because they get activated due to mandatory references being satisfied) 
> and register the services while the other is in the Activating state, a 
> simple lock over the service event processing will result in deadlock.  
> Furthermore, to get the correct result, at least one of the services has to 
> be bound while the component to which it is binding is in the Activating 
> state.

Dynamic binding of optional services is not a big issue, because it is known 
to happen at any time and because such events are fully processed, calling the 
bind and unbind methods even during activation.

> 
> It looks like the situation can be simplified a bit by considering, for 
> service events, whether the dependency will result in a state change: if it's 
> optional or mandatory but not the only satisfying service, it won't, but if 
> it's mandatory and the first satisfying service, it will.  We can calculate 
> this before calling any bind methods or activate methods.  After determining 
> this, we know the final state of the component.

SCR already does this but it only considers the impact of the single reference. 
It does not take any other references into account.
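
To illustrate what taking all references into account could look like, here is 
a minimal sketch. The names SatisfactionCheck, Ref and additionChangesState are 
hypothetical and exist only for this example; this is not the Felix SCR code. 
It assumes we know, per reference, whether it is optional and how many matching 
services are currently registered.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Felix SCR code.
class SatisfactionCheck {

    /** Illustrative per-reference bookkeeping (an assumption of this sketch). */
    static final class Ref {
        final boolean optional;
        int available; // number of currently registered matching services

        Ref(boolean optional, int available) {
            this.optional = optional;
            this.available = available;
        }

        boolean satisfied() {
            return optional || available > 0;
        }
    }

    /** The component is satisfied only if every reference is satisfied. */
    static boolean allSatisfied(Map<String, Ref> refs) {
        for (Ref r : refs.values()) {
            if (!r.satisfied()) {
                return false;
            }
        }
        return true;
    }

    /**
     * Returns true if one additional service for the named reference would
     * move the component from unsatisfied to satisfied, looking at all
     * references rather than only the one that received the event.
     */
    static boolean additionChangesState(Map<String, Ref> refs, String refName) {
        boolean before = allSatisfied(refs);
        Ref r = refs.get(refName);
        r.available++;                 // tentatively count the new service
        boolean after = allSatisfied(refs);
        r.available--;                 // undo the tentative change
        return before != after;
    }

    public static void main(String[] args) {
        Map<String, Ref> refs = new LinkedHashMap<>();
        refs.put("log", new Ref(false, 0));   // mandatory, missing
        refs.put("http", new Ref(false, 0));  // mandatory, missing
        // "http" is still missing, so a "log" service alone changes nothing.
        System.out.println(additionChangesState(refs, "log"));
        refs.get("http").available = 1;
        // Now "log" is the last missing mandatory reference.
        System.out.println(additionChangesState(refs, "log"));
    }
}
```

The check is computed before any bind or activate method would be called. In a 
real implementation it would also have to run under whatever lock guards the 
reference counts, which this sketch leaves out.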

> 
> We're considering whether some kind of 2-stage lock would work:
> 
> one level can change the state and blocks all other threads
> the other level can't change the state and lets stuff like service events for 
> non-state-changing service references be processed according to the final 
> state of the component. (e.g. activating will let bind methods be called on 
> the under-configuration object).
> 
> This does not yet consider bundle event driven state changes or deactivation 
> or delayed component creation or service factories.
> 
> Comments and more scenarios to consider are more than welcome.

I would rather come back to a proposal I already made on the bug:

If a service or configuration event takes place while the component is in the 
transient Activating state, the event is placed into a special queue for 
further processing. When the transient state is exited, the queue is checked 
for further actions to take.

There is only a small number of situations:

   * Service added: This must be handled
   * Service removed: Might deactivate the component immediately.
   * Config update or delete: Might deactivate the component

The problem here is the removal of a service while the component is being 
activated. When we queue this event and handle it later, the service is already 
gone and will be in an undefined/unusable state, causing problems. But there is 
probably not much we can do about this because the component might be in the 
activate method, and synchronizing at this point risks deadlocks.

Thus, I think the queue for post-processing while in the Activating state 
sounds like the most sensible thing to do (with some small remaining window for 
things going wrong). This is as easy as implementing the deactivate and 
activate methods in the Activating state to enqueue these requests.
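
A minimal sketch of this queueing idea follows. All names (QueueingComponent, 
handleEvent, the State enum) are hypothetical and chosen for illustration; the 
real state classes in SCR look different. The sketch assumes events can be 
represented as simple named actions and that user code must run outside the 
lock to avoid the deadlocks discussed above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch, not Felix SCR code.
class QueueingComponent {
    enum State { UNSATISFIED, ACTIVATING, ACTIVE }

    private final Object lock = new Object();
    private final Queue<Runnable> pending = new ArrayDeque<>();
    private State state = State.UNSATISFIED;

    /** Records what happened, standing in for real bind/unbind/deactivate work. */
    final List<String> log = Collections.synchronizedList(new ArrayList<>());

    /** Service or configuration event: deferred while Activating, else handled now. */
    void handleEvent(String name) {
        synchronized (lock) {
            if (state == State.ACTIVATING) {
                pending.add(() -> log.add("deferred:" + name));
                return;
            }
        }
        log.add("immediate:" + name);
    }

    /** Runs the user's activate code outside the lock, then drains the queue. */
    void activate(Runnable userActivate) {
        synchronized (lock) {
            state = State.ACTIVATING;
        }
        userActivate.run();
        while (true) {
            Runnable next;
            synchronized (lock) {
                next = pending.poll();
                if (next == null) {
                    // Leave the transient state only once the queue is empty,
                    // so no event can slip in between drain and state change.
                    state = State.ACTIVE;
                    return;
                }
            }
            next.run(); // process the deferred event outside the lock
        }
    }

    State state() {
        synchronized (lock) {
            return state;
        }
    }

    public static void main(String[] args) {
        QueueingComponent c = new QueueingComponent();
        c.handleEvent("a");
        c.activate(() -> c.handleEvent("b")); // event arrives during activation
        c.handleEvent("c");
        System.out.println(c.log); // prints [immediate:a, deferred:b, immediate:c]
    }
}
```

The remaining window mentioned above is visible here: a deferred "service 
removed" event is only processed after the activate code has returned, by which 
time the service object may already be unusable.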

Regards
Felix
