Hi all,

There are currently a number of issues with the Failover behaviour of
the client which require some attention. It would be good to discuss
them and work towards having the Failover implementation more fully
meet user expectations. I am going to be spending some time working in
this area along with Alex Rudyy in the weeks ahead.

Some of the issues to consider:

1. Non-blocking approach currently leads to correctness issues.

The 0-10 codepath uses a non-blocking Failover model which currently
fails to protect the client from performing certain operations during
Failover, and this can lead to unexpected behaviour. For example,
closing QueueBrowsers during Failover has been observed to cause
issues: because the close and Failover are allowed to progress
concurrently, the client can send the old subscription's destination
in a cancel command to the new broker. In the observed case Failover
had started but not yet completed the resubscription operations, so
the new broker did not yet know about the destination and had to
respond by closing the Session with a NOT_FOUND execution exception.
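
To make the kind of protection I mean concrete, here is a minimal
sketch of one possible way to stop operations such as close() from
overlapping with Failover. It is not code from the client:
FailoverGate, runOperation and runFailover are hypothetical names, and
it assumes a blocking model in which operations simply wait for
Failover to finish.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical helper, not part of the Qpid client: keeps ordinary
// client operations from running while Failover is in progress.
final class FailoverGate {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

    // Ordinary operations (e.g. closing a QueueBrowser) share the read
    // lock, so they may overlap each other but never Failover.
    <T> T runOperation(Supplier<T> operation) {
        lock.readLock().lock();
        try {
            return operation.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Failover takes the write lock: it waits for in-flight operations
    // to finish and holds back new ones until the supplied work
    // (reconnect plus resubscription) has completed.
    void runFailover(Runnable reconnectAndResubscribe) {
        lock.writeLock().lock();
        try {
            reconnectAndResubscribe.run();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

With something like this, a cancel for a subscription could only be
sent either before Failover starts or after resubscription completes,
never in between.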

2. Transacted sessions

With the 0-10 client, any transacted Sessions in use are currently
closed upon Failover occurring, and upon next use of the Session the
client application then gets an exception indicating the Session is
closed. This seems to give little benefit to users from having
Failover while using transactions, which to me actually seems like the
most obvious use case. A further issue with this process is that it is
completely different from the approach taken by the 0-8/9 codepath,
making compatibility during upgrade an issue.

Upon Failover occurring, recreating the Session and providing a means
to indicate that the previous transaction was not successful seems
like a more user-friendly thing to do, as it is more in line with user
expectations of how transactions work, and this is exactly what the
0-8/9 codepath does. JMS provides a TransactionRolledBackException
which can be thrown upon commit(), and the 0-8/9 client codepath uses
it when Failover occurs to indicate that the transaction is no longer
valid and was rolled back, allowing the client application to simply
replay its transaction.
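
From the application's side, the replay pattern described above could
look roughly like the sketch below. To keep it self-contained it does
not use the JMS API: RolledBackException stands in for
javax.jms.TransactionRolledBackException, and TxSession and
ReplayingSender are hypothetical names introduced only for
illustration.

```java
import java.util.List;

// Stand-in for javax.jms.TransactionRolledBackException so that the
// sketch compiles without a JMS jar on the classpath.
class RolledBackException extends Exception {
    RolledBackException(String message) { super(message); }
}

// Hypothetical minimal view of a transacted session.
interface TxSession {
    void send(String body);
    void commit() throws RolledBackException;
}

final class ReplayingSender {
    // Sends the whole batch and commits. If commit() reports that the
    // transaction was rolled back (e.g. because Failover occurred), the
    // batch is replayed, up to maxAttempts times in total.
    // Returns the number of attempts that were needed.
    static int sendInTransaction(TxSession session, List<String> batch, int maxAttempts)
            throws RolledBackException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RolledBackException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            for (String body : batch) {
                session.send(body);
            }
            try {
                session.commit();
                return attempt;
            } catch (RolledBackException e) {
                last = e; // transaction is gone; loop round and replay it
            }
        }
        throw last;
    }
}
```

The point is that the Session stays usable after Failover, so the
application only needs to handle one exception on commit() rather than
rebuild its Session and consumers.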

3. [Dead]locks

The current client implementation is rather heavy on locks, and the
various routes for acquiring them have created situations which can
result in deadlock. It would be worth investigating a reduction in the
number of locks required by the client, both to make the
implementation clearer and to reduce or even remove the possibility of
deadlocks (e.g. the recent issues around actually closing
subscriptions before closing the sessions).
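
Where several locks really are needed, one common way to rule out
deadlock between them is to always acquire them in a single fixed
order. The sketch below illustrates that idea only; it is not how the
client works today. RankedLock and OrderedLocks are hypothetical
names, and it assumes every lock is given a distinct rank.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical lock with a fixed rank; not a class from the Qpid client.
final class RankedLock {
    final int rank;
    final ReentrantLock lock = new ReentrantLock();

    RankedLock(int rank) {
        this.rank = rank;
    }
}

final class OrderedLocks {
    // Runs the action while holding all the given locks. The locks are
    // always taken in ascending rank order, whatever order the caller
    // lists them in, so two threads can never hold them in opposite
    // orders. Assumes ranks are distinct.
    static void withAll(Runnable action, RankedLock... locks) {
        RankedLock[] ordered = locks.clone();
        Arrays.sort(ordered, Comparator.comparingInt((RankedLock l) -> l.rank));
        int held = 0;
        try {
            for (RankedLock l : ordered) {
                l.lock.lock();
                held++;
            }
            action.run();
        } finally {
            // Release in reverse order of acquisition.
            for (int i = held - 1; i >= 0; i--) {
                ordered[i].lock.unlock();
            }
        }
    }
}
```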

4. Acks

Currently there are a number of issues with our acknowledgement
generation process that mean we are fairly non-compliant with the JMS
specification, and the reliability guarantees people are expecting may
not be met as a result.


Robbie

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscr...@qpid.apache.org
