On Mon, 2008-03-03 at 10:22 +0000, Aidan Skinner wrote: > On Fri, Feb 29, 2008 at 8:23 PM, Rafael Schloming <[EMAIL PROTECTED]> wrote: > > > The deadlock is between the session's _messageDeliveryLock and the > > Dispatcher's _lock. The dispatcher thread's main loop attempts to > > acquire _lock first and then _messageDeliveryLock. The main thread > > acquires _messageDeliveryLock first thing on close(...) and then > > subsequently many levels down the stack attempts to acquire _lock in > > order to call Dispatcher.rejectPending(...). > > We found (and fixed) several similar problems on M2, it's probably > worth having a look at that code to make sure there aren't any cases > being missed. > > - Aidan >
Rafael is absolutely right, Session.close is holding a lock on _messageDeliveryLock only to prevent messages being delivered to consumers when the session is closing (note that this lock should also be held when consumer are closing, see Qpid-812). There is however no reason for the dispatcher thread to keep a lock on _lock when dispatching a message. _lock is used for stopping/starting the dispatcher thread and _messageDeliveryLock for preventing message delivery. So, we should change the run method of dispatcher for releasing _lock and rollback and rejectPending for synchronizing on _messageDeliveryLock. Note that this deadlock situation would not happen with 0.8 as an external thread is calling rejectPending but (if the dispatcher is trying to dispatch a message) this thread will need to wait until close finishes and will create a new dispatcher thread that will not be closed. The way the dispatcher thread is closed should also be changed. The close operation currently interrupts the thread this means that the dispatcher does not have a chance to acquire a lock on _messageDeliveryLock for correctly releasing the message it may be waiting to dispatch. Arnaud
