Deadlock On Component Uninstall (using Seda Flow)
-------------------------------------------------

                 Key: SM-1858
                 URL: https://issues.apache.org/activemq/browse/SM-1858
             Project: ServiceMix
          Issue Type: Bug
          Components: servicemix-core
    Affects Versions: 3.3
         Environment: SunOS 5.10 Generic_137111-06 sun4u sparc SUNW,Sun-Fire-880
Java(TM) SE Runtime Environment (build 1.6.0_01-b06)
Java HotSpot(TM) 64-Bit Server VM (build 1.6.0_01-b06, mixed mode)
ServiceMix 3.3

            Reporter: Corey Baswell


I've recently upgraded ServiceMix from 3.1 to 3.3 and have started seeing a 
deadlock when we attempt to reinstall a component on a running system. I'm not 
sure whether this bug was present prior to 3.3, but I've never seen it in 3.1. 
I'll try to explain the scenario. There are two components involved, A and B. 
Component A calls B and returns a response to its caller. The deadlock occurs 
if I try to uninstall component A (the first step in a reinstall) while 
component A is waiting for a response from component B. 

The timeline of the locks that cause this deadlock is:

 # Component A sends a synchronous request to component B through the SedaFlow. 
A read lock is established on the flow. 
 # Since the request is synchronous the Component A request thread waits on the 
MessageExchange object (to be notified when a response is ready).
 # A reinstall of component A is triggered. 
org.apache.servicemix.jbi.framework.InstallationService.unloadInstaller is 
called to first remove this component. The first thing the 
InstallationService does is suspend the broker, which in turn suspends the 
SedaFlow. Before the SedaFlow can be suspended, a write lock must be acquired 
on the flow; however, this write lock cannot be acquired until the read lock 
from step 1 is released.
 # Component B finishes its request and is now ready to return the response. 
Before it calls notify on the MessageExchange object from step 2 (allowing 
Component A to finish its request), it must first acquire a read lock on the 
SedaFlow. However, it can't acquire this read lock because of the write lock 
request waiting in step 3. 
 # Deadlock
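
The lock ordering in steps 1 to 4 can be reproduced outside ServiceMix. The following is a minimal sketch, assuming the flow's lock behaves like a default (non-fair) java.util.concurrent.locks.ReentrantReadWriteLock; the class and method names are hypothetical and nothing here is taken from the ServiceMix source. A timed tryLock stands in for component B's read-lock attempt so that the sketch terminates instead of hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-alone sketch; not ServiceMix code.
final class FlowLockSketch {

    /** Returns true if a late reader obtained the read lock while a writer was queued. */
    public static boolean lateReaderGetsLock() {
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        final boolean[] acquired = new boolean[1];

        // Steps 1-2: "component A" holds a read lock while it waits for a response.
        lock.readLock().lock();
        try {
            // Step 3: "suspend" queues for the write lock behind that reader.
            Thread suspender = new Thread(new Runnable() {
                public void run() {
                    try {
                        lock.writeLock().lockInterruptibly();
                        lock.writeLock().unlock();
                    } catch (InterruptedException e) {
                        // Interrupted at the end of the sketch; nothing to release.
                    }
                }
            });
            suspender.start();
            while (!lock.hasQueuedThread(suspender)) {
                Thread.sleep(10);
            }

            // Step 4: "component B" asks for a read lock from another thread.
            Thread responder = new Thread(new Runnable() {
                public void run() {
                    try {
                        if (lock.readLock().tryLock(500, TimeUnit.MILLISECONDS)) {
                            acquired[0] = true;
                            lock.readLock().unlock();
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            responder.start();
            responder.join();

            // Clean up the queued writer so the sketch can exit.
            suspender.interrupt();
            suspender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("sketch interrupted", e);
        } finally {
            lock.readLock().unlock();
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        System.out.println("late reader acquired lock: " + lateReaderGetsLock());
    }
}
```

With a lock of that kind, the late reader is made to wait behind the queued writer even though only a read lock is currently held, which is the situation component B is in at step 4. In the real scenario the first read lock is never released, because component A is itself waiting on B.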

I'm not sure what the best way to fix this is, but since I don't understand the 
interaction of ServiceMix's internals well enough to dork with the 
synchronization, I'm going to change the write lock attempt in the suspend() 
method of AbstractFlow to time out after a couple of seconds. Something like:

{code}
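    // Needs: import java.util.concurrent.TimeUnit;
    public synchronized void suspend() {
        if (log.isDebugEnabled()) {
            log.debug("Called Flow suspend");
        }
        try
        {
          // tryLock returns false on timeout, so its result has to be checked
          if (!lock.writeLock().tryLock(10, TimeUnit.SECONDS))
          {
            throw new RuntimeException("Unable to suspend flow because write lock could not be acquired.");
          }
        }
        catch (InterruptedException iexc)
        {
          // restore the interrupt flag before reporting the failure
          Thread.currentThread().interrupt();
          throw new RuntimeException("Interrupted while waiting for the flow write lock.", iexc);
        }
        suspendThread = Thread.currentThread();
    }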

{code}

I think this will work in my scenario because I'm only using one flow (seda). 
If multiple flows were being used, however, it would be possible for some to be 
suspended before this exception is thrown, which could leave everything in a 
bad state (i.e. this is definitely a hack).
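
To make the multi-flow concern concrete, here is a hedged sketch of one way a timed suspend could avoid leaving some flows suspended: take each write lock with a timeout and, if any attempt fails, release the ones already taken. It assumes each flow is guarded by its own java.util.concurrent.locks.ReentrantReadWriteLock; SuspendAllSketch and suspendAll are hypothetical names and this is not how ServiceMix implements suspend.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-alone sketch; not ServiceMix code.
final class SuspendAllSketch {

    /**
     * Tries to take every write lock within the timeout. If any attempt fails,
     * the locks already taken are released and false is returned, so no flow
     * is left half-suspended.
     */
    public static boolean suspendAll(List<ReentrantReadWriteLock> flowLocks,
                                     long timeout, TimeUnit unit) {
        Deque<ReentrantReadWriteLock> held = new ArrayDeque<ReentrantReadWriteLock>();
        boolean ok = false;
        try {
            for (ReentrantReadWriteLock lock : flowLocks) {
                if (!lock.writeLock().tryLock(timeout, unit)) {
                    return false; // timed out; the finally block rolls back
                }
                held.push(lock);
            }
            ok = true;
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            if (!ok) {
                // Release in reverse order of acquisition.
                while (!held.isEmpty()) {
                    held.pop().writeLock().unlock();
                }
            }
        }
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock a = new ReentrantReadWriteLock();
        ReentrantReadWriteLock b = new ReentrantReadWriteLock();

        // A read lock on b makes the second write-lock attempt time out.
        b.readLock().lock();
        boolean first = suspendAll(Arrays.asList(a, b), 200, TimeUnit.MILLISECONDS);
        System.out.println("suspended with reader present: " + first);
        System.out.println("a left write-locked: " + a.isWriteLocked());

        b.readLock().unlock();
        boolean second = suspendAll(Arrays.asList(a, b), 200, TimeUnit.MILLISECONDS);
        System.out.println("suspended after reader left: " + second);
    }
}
```

This only addresses the partial-suspend problem. It does nothing about the underlying ordering between the read lock and the MessageExchange wait, so it is still a workaround rather than a fix.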

I've also attached the stack traces from the time line above.
