The consensus is that there is no easy fix other than rewriting some key pieces of the client. Given that a new JMS client is being actively developed, we have decided not to devote any significant time or resources to the older client.
The new JMS client is AMQP 1.0 and the source is here
https://git-wip-us.apache.org/repos/asf?p=qpid-jms.git

Robbie is the best person to give an accurate picture of the status for this project.

Rajith

On Thu, Feb 5, 2015 at 2:42 AM, Rob Godfrey <[email protected]> wrote:

> To answer the easy question first, 0.32 clients should be totally
> compatible with an 0.16 broker.
>
> In terms of the deadlock issue, I know there are a number of open deadlock
> JIRAs that Robbie and Rajith have looked at in the past... Hopefully one of
> them will be able to chip in here and discuss the feasibility of fixing
> your particular issue for 0.32.
>
> -- Rob
>
> On 4 February 2015 at 23:36, Helen Kwong <[email protected]> wrote:
>
> > Hi Qpid gurus,
> >
> > We are using 0.16 Java broker and client on 0-10, and we are running into
> > deadlock issues on the client that involve AMQSession's
> > _messageDeliveryLock and AMQConnection's _failoverMutex, where different
> > threads acquire them in different orders. This is leading to major
> > production headaches for us and has us very worried.
> >
> > (I've been looking at 0.16 code mostly but also skimmed the relevant parts
> > in 0.32, which seem largely the same in those places.)
> >
> > Deadlock Variety 1
> >
> > This is an example of a deadlock we see, where the IOReceiver thread
> > deadlocks with Session dispatcher thread (we have listeners that call
> > session rollback or commit in onMessage()):
> >
> > IOReceiver
> > org.apache.qpid.client.AMQSession.closed(AMQSession.java:818) ----------> waiting for session's messageDeliveryLock
> > org.apache.qpid.client.AMQConnection.closeAllSessions(AMQConnection.java:938)
> > org.apache.qpid.client.AMQConnection.exceptionReceived(AMQConnection.java:1282) ----------> acquires connection's failoverMutex
> > org.apache.qpid.client.AMQSession_0_10.setCurrentException(AMQSession_0_10.java:1057)
> > org.apache.qpid.client.AMQSession_0_10.exception(AMQSession_0_10.java:907)
> > org.apache.qpid.transport.SessionDelegate.executionException(SessionDelegate.java:182)
> > org.apache.qpid.transport.SessionDelegate.executionException(SessionDelegate.java:32)
> > org.apache.qpid.transport.ExecutionException.dispatch(ExecutionException.java:103)
> > org.apache.qpid.transport.SessionDelegate.command(SessionDelegate.java:55)
> > org.apache.qpid.transport.SessionDelegate.command(SessionDelegate.java:50)
> > org.apache.qpid.transport.SessionDelegate.command(SessionDelegate.java:32)
> > org.apache.qpid.transport.Method.delegate(Method.java:159)
> > org.apache.qpid.transport.Session.received(Session.java:585)
> > org.apache.qpid.transport.Connection.dispatch(Connection.java:412)
> > org.apache.qpid.transport.ConnectionDelegate.handle(ConnectionDelegate.java:64)
> > org.apache.qpid.transport.ConnectionDelegate.handle(ConnectionDelegate.java:40)
> > org.apache.qpid.transport.MethodDelegate.executionException(MethodDelegate.java:110)
> > org.apache.qpid.transport.ExecutionException.dispatch(ExecutionException.java:103)
> > org.apache.qpid.transport.ConnectionDelegate.command(ConnectionDelegate.java:54)
> > org.apache.qpid.transport.ConnectionDelegate.command(ConnectionDelegate.java:40)
> > org.apache.qpid.transport.Method.delegate(Method.java:159)
> > org.apache.qpid.transport.Connection.received(Connection.java:367)
> > org.apache.qpid.transport.Connection.received(Connection.java:65)
> > org.apache.qpid.transport.network.Assembler.emit(Assembler.java:97)
> > org.apache.qpid.transport.network.Assembler.assemble(Assembler.java:198)
> > org.apache.qpid.transport.network.Assembler.frame(Assembler.java:131)
> > org.apache.qpid.transport.network.Frame.delegate(Frame.java:128)
> > org.apache.qpid.transport.network.Assembler.received(Assembler.java:102)
> > org.apache.qpid.transport.network.Assembler.received(Assembler.java:44)
> > org.apache.qpid.transport.network.InputHandler.next(InputHandler.java:189)
> > org.apache.qpid.transport.network.InputHandler.received(InputHandler.java:105)
> > org.apache.qpid.transport.network.InputHandler.received(InputHandler.java:44)
> > org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:152)
> > java.lang.Thread.run(Thread.java:745)
> >
> > Session dispatcher thread
> > org.apache.qpid.client.AMQConnection.exceptionReceived(AMQConnection.java:1255) ----------> waiting for connection's failoverMutex
> > org.apache.qpid.client.AMQSession_0_10.setCurrentException(AMQSession_0_10.java:1057)
> > org.apache.qpid.client.AMQSession_0_10.sync(AMQSession_0_10.java:1034)
> > org.apache.qpid.client.AMQSession_0_10.sendSuspendChannel(AMQSession_0_10.java:812)
> > org.apache.qpid.client.AMQSession.suspendChannel(AMQSession.java:3075)
> > org.apache.qpid.client.AMQSession.rollback(AMQSession.java:1837)
> > common.messaging.QpidSession.rollback(QpidSession.java:211)
> > common.messaging.QpidMessageHandler.rollbackSession(QpidMessageHandler.java:284)
> > common.messaging.QpidMessageHandler.onMessage(QpidMessageHandler.java:113)
> > org.apache.qpid.client.BasicMessageConsumer.notifyMessage(BasicMessageConsumer.java:748)
> > org.apache.qpid.client.BasicMessageConsumer_0_10.notifyMessage(BasicMessageConsumer_0_10.java:141)
> > org.apache.qpid.client.BasicMessageConsumer.notifyMessage(BasicMessageConsumer.java:722)
> > org.apache.qpid.client.BasicMessageConsumer_0_10.notifyMessage(BasicMessageConsumer_0_10.java:186)
> > org.apache.qpid.client.BasicMessageConsumer_0_10.notifyMessage(BasicMessageConsumer_0_10.java:54)
> > org.apache.qpid.client.AMQSession$Dispatcher.notifyConsumer(AMQSession.java:3454)
> > org.apache.qpid.client.AMQSession$Dispatcher.dispatchMessage(AMQSession.java:3393) ----------> acquires session's messageDeliverylock
> > org.apache.qpid.client.AMQSession$Dispatcher.access$1000(AMQSession.java:3180)
> > org.apache.qpid.client.AMQSession.dispatch(AMQSession.java:3173)
> > org.apache.qpid.client.message.UnprocessedMessage.dispatch(UnprocessedMessage.java:54)
> > org.apache.qpid.client.AMQSession$Dispatcher.run(AMQSession.java:3316)
> > java.lang.Thread.run(Thread.java:745)
> >
> > The problem is that the IOReceiver thread acquires failoverMutex before
> > messageDeliveryLock (for each session), whereas the dispatcher thread
> > acquires it in the other order. We also see potential problems where other
> > threads (instead of IOReceiver) can deadlock with the dispatcher thread, as
> > long as it acquires failoverMutex before messageDeliveryLock.
> > Examples we can think of:
> >
> > A) Another thread calling AMQSession.close()
> > B) Another thread calling BasicMessageConsumer.close()
> > C) Same connection, different session's dispatcher thread, calling
> > rollback() or commit() -> sync() -> setCurrentException() ->
> > AMQConnection.exceptionReceived() -> AMQConnection.closeAllSessions(),
> > which can try to acquire the messageDeliveryLock of another session and
> > deadlock with the other session's dispatcher thread
> >
> >
> > Deadlock Variety 2:
> >
> > From code inspection, it also appears that AMQConnection.close() can
> > deadlock with either AMQSession.close() or BasicMessageConsumer.close()
> > (where the session / consumer is on the same connection). This is because
> > AMQConnection.close() first acquires the messageDeliveryLock of all its
> > sessions in the recursive doClose(), before trying to acquire the
> > connection's failoverMutex. But the Session / consumer's close() acquires
> > the failoverMutex before messageDeliveryLock. We haven't seen this happen
> > but would like to know if this is possible.
> >
> >
> > We'd really appreciate your help on this. Assuming these can be fixed in
> > 0.32, we are also wondering if clients are backward compatible -- i.e., can
> > we upgrade only our client to 0.32 while continuing to use the 0.16 broker?
> >
> > Thanks,
> > Helen
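
For readers following the thread, the lock-order inversion described in the quoted report can be sketched in isolation. This is a minimal illustration, not Qpid code: the class LockOrderSketch and its methods are hypothetical, and the two ReentrantLock fields are only stand-ins for AMQConnection's _failoverMutex and AMQSession's _messageDeliveryLock (whose actual types in the client may differ). A timed tryLock is used so the sketch reports the conflict instead of hanging the way the real client does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderSketch {
    // Hypothetical stand-ins for the two locks named in the report.
    static final ReentrantLock failoverMutex = new ReentrantLock();
    static final ReentrantLock messageDeliveryLock = new ReentrantLock();

    // Holds 'first', waits until the peer holds its own first lock, then
    // tries 'second' with a timeout. 'first' is released only after both
    // threads have finished their attempt.
    static Thread worker(ReentrantLock first, ReentrantLock second,
                         CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                         AtomicInteger timeouts) {
        return new Thread(() -> {
            first.lock();
            try {
                bothHoldFirst.countDown();
                bothHoldFirst.await();
                if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                    second.unlock();
                } else {
                    timeouts.incrementAndGet(); // would block forever with lock()
                }
                bothTried.countDown();
                bothTried.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                first.unlock();
            }
        });
    }

    // One thread takes failoverMutex then messageDeliveryLock (like the
    // IOReceiver path); the other takes them in the opposite order (like
    // the dispatcher path). Returns how many threads timed out.
    static int countTimeoutsWithInvertedOrder() throws InterruptedException {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger timeouts = new AtomicInteger();
        Thread ioReceiver = worker(failoverMutex, messageDeliveryLock,
                bothHoldFirst, bothTried, timeouts);
        Thread dispatcher = worker(messageDeliveryLock, failoverMutex,
                bothHoldFirst, bothTried, timeouts);
        ioReceiver.start();
        dispatcher.start();
        ioReceiver.join();
        dispatcher.join();
        return timeouts.get();
    }

    // Both threads take failoverMutex first, then messageDeliveryLock.
    // Returns true if both threads finish.
    static boolean consistentOrderCompletes() throws InterruptedException {
        Runnable task = () -> {
            failoverMutex.lock();
            try {
                messageDeliveryLock.lock();
                try {
                    // critical section would go here
                } finally {
                    messageDeliveryLock.unlock();
                }
            } finally {
                failoverMutex.unlock();
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        a.join(2000);
        b.join(2000);
        return !a.isAlive() && !b.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("timeouts with inverted order: "
                + countTimeoutsWithInvertedOrder());
        System.out.println("consistent order completes: "
                + consistentOrderCompletes());
    }
}
```

In the inverted case each thread holds the lock the other one needs, so neither timed attempt can succeed; with blocking lock() calls that is the deadlock shown in the stack traces. Making every code path take the two locks in one agreed order removes the cycle, which is why a fix in the old client would mean reworking the paths listed in the report rather than patching a single call site.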
