Deadlock On Component Uninstall (using Seda Flow)
-------------------------------------------------
Key: SM-1858
URL: https://issues.apache.org/activemq/browse/SM-1858
Project: ServiceMix
Issue Type: Bug
Components: servicemix-core
Affects Versions: 3.3
Environment: SunOS 5.10 Generic_137111-06 sun4u sparc SUNW,Sun-Fire-880
Java(TM) SE Runtime Environment (build 1.6.0_01-b06)
Java HotSpot(TM) 64-Bit Server VM (build 1.6.0_01-b06, mixed mode)
ServiceMix 3.3
Reporter: Corey Baswell
I recently upgraded ServiceMix from 3.1 to 3.3 and have started seeing a
deadlock when we attempt to reinstall a component on a running system. I'm not
sure whether this bug was present prior to 3.3, but I never saw it in 3.1.
I'll try to explain the scenario. Two components are involved, A and B.
Component A calls B and returns a response to its caller. The deadlock occurs
if I try to uninstall component A (the first step in a reinstall) while
component A is waiting for a response from component B.
The timeline of the locks that cause this deadlock is:
# Component A sends a synchronous request to component B through the SedaFlow.
A read lock is established on the flow.
# Since the request is synchronous, the Component A request thread waits on the
MessageExchange object (to be notified when a response is ready).
# A reinstall of component A is triggered.
org.apache.servicemix.jbi.framework.InstallationService.unloadInstaller is
called to first remove this component. The first thing the
InstallationService does is suspend the broker, which in turn suspends the
SedaFlow. Before the SedaFlow can be suspended, a write lock must be acquired
on the flow. However, this write lock cannot be acquired until the read lock
from step 1 is released.
# Component B finishes its request and is now ready to return the response.
Before it calls notify on the MessageExchange object from step 2 (allowing
Component A to finish its request), it must first acquire a read lock on the
SedaFlow lock. However, it can't acquire this read lock because of the waiting
write lock from step 3.
# Deadlock
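The lock ordering in the steps above can be reproduced in isolation with a plain java.util.concurrent.locks.ReentrantReadWriteLock. The following is only a minimal sketch, not ServiceMix code: the class and method names are hypothetical, the lock is created in fair mode (an assumption, chosen because the Javadoc documents that a fair read lock blocks while a writer is waiting), and timeouts and an interrupt are used so the demonstration terminates instead of hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; none of these names exist in ServiceMix.
class SedaFlowDeadlockSketch {

    /** Returns true if "component B" could not get a read lock behind the queued writer. */
    static boolean readerBlockedByQueuedWriter() {
        // Assumption: fair mode, so a waiting writer blocks new readers.
        final ReentrantReadWriteLock flowLock = new ReentrantReadWriteLock(true);
        final AtomicBoolean bGotReadLock = new AtomicBoolean(false);

        // Step 1: component A's request thread (this thread) holds a read lock.
        flowLock.readLock().lock();
        try {
            // Step 3: the suspend thread queues for the write lock.
            Thread suspender = new Thread(() -> {
                try {
                    flowLock.writeLock().lockInterruptibly();
                    flowLock.writeLock().unlock();
                } catch (InterruptedException e) {
                    // Interrupted below so the sketch can end instead of hanging.
                }
            });
            suspender.start();
            // Wait until the writer is actually queued on the lock.
            while (!flowLock.hasQueuedThreads()) {
                Thread.sleep(10);
            }

            // Step 4: component B needs a read lock to deliver the response.
            Thread componentB = new Thread(() -> {
                try {
                    if (flowLock.readLock().tryLock(200, TimeUnit.MILLISECONDS)) {
                        bGotReadLock.set(true);
                        flowLock.readLock().unlock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            componentB.start();
            componentB.join();

            // Without the timeout and this interrupt, all three threads would wait forever.
            suspender.interrupt();
            suspender.join();
            return !bGotReadLock.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("sketch interrupted", e);
        } finally {
            flowLock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("B blocked behind queued writer: " + readerBlockedByQueuedWriter());
    }
}
```

In the real scenario the first reader never releases its lock, because it is waiting for the notification that component B cannot deliver.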
I'm not sure what the best way to fix this is. Since I don't understand
ServiceMix's internals well enough to dork with the synchronization, I'm going
to change the write lock attempt in the suspend() method of AbstractFlow to
time out after a few seconds. Something like:
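{code}
// Requires: import java.util.concurrent.TimeUnit;
public synchronized void suspend() {
    if (log.isDebugEnabled()) {
        log.debug("Called Flow suspend");
    }
    try
    {
        // tryLock returns false on timeout, so the result must be checked.
        if (!lock.writeLock().tryLock(10, TimeUnit.SECONDS))
        {
            throw new RuntimeException("Unable to suspend flow because write lock could not be acquired.");
        }
    }
    catch (InterruptedException iexc)
    {
        // Restore the interrupt status before reporting the failure.
        Thread.currentThread().interrupt();
        throw new RuntimeException("Unable to suspend flow because write lock could not be acquired.", iexc);
    }
    suspendThread = Thread.currentThread();
}
{code}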
I think this will work in my scenario because I'm only using one flow (seda).
If multiple flows were being used, however, some of them could be suspended
before this exception is thrown, and that could leave everything in a bad
state (i.e. this is definitely a hack).
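One possible way to limit the multi-flow risk would be to suspend the flows all-or-nothing, releasing any flows already suspended when a later one times out. The sketch below only illustrates that idea under stated assumptions: SketchFlow, trySuspend, resume and suspendAll are hypothetical names, not the ServiceMix AbstractFlow or broker API, and each flow is modelled as nothing more than a ReentrantReadWriteLock.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; not part of ServiceMix.
class AllOrNothingSuspendSketch {

    /** Hypothetical stand-in for a flow, reduced to its read/write lock. */
    static class SketchFlow {
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        // Timed write-lock attempt; false means the flow could not be suspended.
        boolean trySuspend(long timeout, TimeUnit unit) throws InterruptedException {
            return lock.writeLock().tryLock(timeout, unit);
        }

        // Must be called by the thread that suspended the flow.
        void resume() {
            lock.writeLock().unlock();
        }

        boolean isSuspended() {
            return lock.isWriteLocked();
        }
    }

    /** Suspends every flow, or none: on any failure the earlier flows are resumed. */
    static boolean suspendAll(List<SketchFlow> flows, long timeout, TimeUnit unit) {
        Deque<SketchFlow> suspended = new ArrayDeque<>();
        try {
            for (SketchFlow flow : flows) {
                if (!flow.trySuspend(timeout, unit)) {
                    rollback(suspended);
                    return false;
                }
                suspended.push(flow);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            rollback(suspended);
            return false;
        }
    }

    // Resume in reverse order of suspension.
    private static void rollback(Deque<SketchFlow> suspended) {
        while (!suspended.isEmpty()) {
            suspended.pop().resume();
        }
    }

    public static void main(String[] args) {
        SketchFlow first = new SketchFlow();
        SketchFlow second = new SketchFlow();
        // Simulate an in-flight exchange holding a read lock on the second flow.
        second.lock.readLock().lock();
        boolean ok = suspendAll(Arrays.asList(first, second), 50, TimeUnit.MILLISECONDS);
        System.out.println("suspended all: " + ok + ", first left suspended: " + first.isSuspended());
        second.lock.readLock().unlock();
    }
}
```

This keeps the timeout hack from leaving a mix of suspended and running flows, but it does not remove the underlying lock ordering problem.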
I've also attached the stack traces from the timeline above.