On Thu, Sep 22, 2011 at 12:19 PM, Oleksandr Rudyy <oru...@gmail.com> wrote:
> Thanks, Rajith, for your comments.
> I have discussed them with Robbie; our comments are inline.
>
>>> Qpid Java Client Failover Policy
>>>
>>> 1. Qpid client failover basic principles.
>>> ===================================================
>>>
>>> When the connection to the broker is lost, a Qpid client should be
>>> able to re-establish it unless failover is switched off by
>>> specifying "nofailover" as the failover option in the connection
>>> URL.
>>
>>> The failover functionality in the Qpid client should be based on
>>> the principle of "stopping the world". When the connection is lost
>>> and failover starts, the Qpid client should not allow the invocation
>>> of any JMS operation that requires sending or receiving data over
>>> the network. Such operations should be blocked until the failover
>>> functionality restores connectivity using one of the supported
>>> failover methods ('singlebroker', 'roundrobin', 'failover_exchange').
>>
>> I also think it would be useful if we could provide a brief write-up
>> about the different failover methods we support and exactly what can
>> be expected from each.
>> We also need to clearly define options like reconnect_timeout,
>> reconnect_limit and reconnect_interval, and should align ourselves
>> with the C++ and Python (Ruby, etc.) clients in terms of the
>> behaviour of these options.
>>
>
> How particular values like the reconnect interval are obtained and
> used seems like it can be separated from the messaging behaviour of
> the client during failover, and so could be addressed later.

That is fine, but I wanted to ensure that we don't forget that aspect,
because we need it in order to provide a consistent and complete
failover experience.
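For context, these options already surface in the Java client's connection URL syntax. A sketch of a round-robin failover configuration might look like the following; the hosts, ports and option values are purely illustrative, and the URL is split across lines only for readability:

```
amqp://guest:guest@clientid/test
    ?brokerlist='tcp://host1:5672?retries='10'&connectdelay='1000';tcp://host2:5672'
    &failover='roundrobin?cyclecount='2''
```

Here retries/connectdelay are per-broker reconnection options, while cyclecount controls how many times the whole broker list is cycled before failover gives up.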

>>> On restoring connectivity, blocked JMS operations should be allowed
>>> to finish. If the failover functionality cannot re-establish the
>>> connection, a JMSException should be thrown from any JMS operation
>>> that requires transferring data over the network.
>>
>> We need to come up with a clear strategy for notifying exceptions,
>> an area where we are falling short in the current implementation.
>> If a connection listener is used then we have to notify via the
>> connection listener, as per the spec.
>>
>> Then comes the question of what we are going to do with a JMS
>> operation that transfers data over the network.
>> Are we still going to throw an exception as mentioned above, or are
>> we going to throw an exception if and only if there is no exception
>> listener?
>>
>
> Looking at the JMS spec section on ExceptionListener, it is quite
> clear about the intended behaviour in that scenario:
>
> "A JMS provider should attempt to resolve connection problems itself prior to
> notifying the client of them.
>
> The exceptions delivered to ExceptionListener are those which don’t have any
> other place to be reported. If an exception is thrown on a JMS call it, by
> definition, must not be delivered to an ExceptionListener (i.e.
> ExceptionListener is
> not for the purpose of monitoring all exceptions thrown by a connection)."

Agreed. I just wanted to highlight that the current behaviour is wrong
according to your description. I have an open JIRA for this: QPID-3289.

The incorrect behaviour of notifying the exception in two ways caused a
deadlock. We have managed to sidestep the deadlock issue, but we don't
have a proper fix there yet.

>> Next up, if we are throwing a JMS exception, how will the application
>> know the difference between a session exception and a connection
>> exception?
>> E.g. a resource limit exceeded vs. a connection issue?
>>
>
> We would have to decide on either a hierarchy of exceptions, or
> perhaps an error-code based solution, or both, in order to convey any
> meaning in such situations.

I think we need to come up with a proper Qpid error code structure
that is protocol-version independent. JMSException allows a vendor
error code to be specified.

I have to add that the current AMQException and the AMQConstant error
codes are a failure, so I suggest we come up with something meaningful.
I'll take a stab at this.
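To make the idea concrete, here is a minimal sketch of what a protocol-version-independent scheme could look like. None of these names or codes exist in the client today; they are purely illustrative, and a real design would need agreement on ranges and categories:

```java
// Hypothetical sketch only: a protocol-independent error-code scheme that
// distinguishes connection-level from session-level failures, so applications
// don't have to string-match exception messages.
class QpidErrorCodes {

    enum ErrorCode {
        // Illustrative codes; a real scheme would reserve ranges per category.
        CONNECTION_FAILOVER_FAILED(1001, true),
        SESSION_RESOURCE_LIMIT_EXCEEDED(2001, false);

        final int code;
        final boolean connectionLevel;

        ErrorCode(int code, boolean connectionLevel) {
            this.code = code;
            this.connectionLevel = connectionLevel;
        }
    }

    // JMSException.getErrorCode() returns a vendor String; an application
    // could recover the category from it like this.
    static boolean isConnectionError(String vendorCode) {
        int code = Integer.parseInt(vendorCode);
        for (ErrorCode e : ErrorCode.values()) {
            if (e.code == code) {
                return e.connectionLevel;
            }
        }
        return false; // unknown codes treated as session-level
    }

    public static void main(String[] args) {
        if (!isConnectionError("1001")) throw new AssertionError();
        if (isConnectionError("2001")) throw new AssertionError();
    }
}
```

An application catching a JMSException could then decide whether it needs to recreate the connection or just the session.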

>> In the current code, this is an area which is causing deadlocks and
>> race conditions. It is also an area that is causing confusion among
>> our users, as they are not sure how exactly the client is going to
>> behave. Therefore I believe it's imperative to fix our exception
>> handling if we are to provide our users with a deterministic failover
>> experience.
>
> Agreed, see later for more discussion.
>
>>> On successful failover, it is expected that the client JMS session
>>> should restore any temporary queues (topics) that were created
>>> before failover.
>>
>> This is not always the case and depends on what reliability
>> guarantees are specified on a per-destination basis via the
>> "reliability" option.
>> If "unreliable" or "at-most-once" is used, then I believe the above
>> is fine.
>>
>> If "at-least-once" is used then we cannot simply create a new
>> temporary queue, as between the time the client failed over and
>> reconnected, there may have been messages sent to the previous
>> temp queue.
>> If we simply create a new temp queue then those messages on the
>> old queue will be lost.
>> This is certainly the case with the C++ broker, which supports
>> clustering, and with both brokers in the event of a temporary
>> network failure.
>>
>
> By restore, we meant what the client currently appears to already do
> with TemporaryQueues: create exclusive auto-delete queues of the same
> name.
>
> Since TemporaryQueues are created by the
> Session.createTemporaryQueue() call, there doesn't appear to be any
> option to specify syntax such as link options. It also doesn't seem
> clear that we could guarantee at-least-once delivery for a temporary
> queue which is deleted automatically when the underlying
> Session/Connection is closed anyway, so perhaps a reasonable stance to
> take would be that at-least-once on TemporaryQueues is not supported
> across failover?

Apologies, I should have made it clear: I incorrectly interpreted your
mention of temp queues as Topics rather than
javax.jms.TemporaryQueues/Topics. Our Topics are temp queues marked
auto-delete and exclusive (see AMQTopic), and my comments about
reliability were in that context. The address-based implementation is
different, although it doesn't go as far as it should; my upcoming work
for QPID-3401 will address that.

In general, Topics should be supported across failover (unless
explicitly marked unreliable). The private queue used for a Topic
should only be closed when the application closes the Session or the
MessageConsumer, whichever happens first.

I also think it's reasonable to say that TemporaryQueues and
TemporaryTopics should not be supported across failover. However, as
you pointed out, the spec says their lifetime is tied to that of the
respective JMS connection.
Since failover is transparent to the JMS Connection, one could argue
that TemporaryQueues/Topics should be supported across failover, as
they are tied to the lifetime of the JMS connection, not the underlying
TCP (or AMQP) connection to the broker.
But I'm willing to be convinced either way on this.

>>> 2. Description of failover behavior for JMS operations
>>> ==================================================
>>>
>>> Session.commit() rolls back the transaction and throws a
>>> TransactionRolledBackException if the session is dirty (there were
>>> messages sent/received in the transaction) when failover occurred,
>>> allowing the user to replay their transaction on the new session.
>>
>> We need more clarification here.
>> 1. Are we going to allow the same transacted session to be used, just
>> like the way we allow non-transacted JMS sessions? In other words,
>> are we going to allow the JMS session to resume once the exception
>> is handled?
>> 2. Or do we ask the application to create a new JMS session before
>> continuing with the work?
>>
>> Currently on the 0-10 path we do #2, but there are customer requests
>> to support #1.
>
>
> #1 is what we meant, and seems like the only reasonable thing to do in
> this situation.

Agreed, and there are users explicitly requesting this, so I fully
support it.
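As a sketch of the application-side pattern that #1 enables: on TransactionRolledBackException the application replays its transactional work on the same session rather than recreating it. The Session and exception types below are minimal stand-ins, not the real javax.jms ones (the real exception is checked and the real session far richer); the stub session fails its first commit to simulate failover hitting a dirty transaction.

```java
// Illustrative only: stand-ins for javax.jms types so the sketch runs
// without a broker. The point is the replay loop in publishWithRetry().
class CommitRetryExample {

    // Stand-in for javax.jms.TransactionRolledBackException (the real one
    // is a checked exception; unchecked here to keep the sketch short).
    static class TransactionRolledBackException extends RuntimeException {}

    // Fake transacted session: the first commit fails, simulating failover
    // occurring while the transaction was dirty; later commits succeed.
    static class Session {
        private boolean failNextCommit = true;
        int successfulCommits = 0;

        void send(String msg) { /* work is buffered until commit */ }

        void commit() {
            if (failNextCommit) {
                failNextCommit = false;
                throw new TransactionRolledBackException();
            }
            successfulCommits++;
        }
    }

    // Behaviour #1: after the rollback exception, resume on the SAME
    // session and replay the transactional work from the start.
    static int publishWithRetry(Session session, String msg, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                session.send(msg);   // replay the work of the transaction
                session.commit();
                return attempt;      // number of attempts that were needed
            } catch (TransactionRolledBackException e) {
                if (attempt >= maxAttempts) {
                    throw e;         // give up after repeated failovers
                }
                // otherwise loop: no new Session is created
            }
        }
    }

    public static void main(String[] args) {
        Session s = new Session();
        if (publishWithRetry(s, "hello", 3) != 2) throw new AssertionError();
        if (s.successfulCommits != 1) throw new AssertionError();
    }
}
```

With #2 the catch block would instead have to close the session and create a fresh one before replaying, which is exactly the burden users are asking us to remove.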

>>
>>> Session.recover() and Message#acknowledge() should throw a
>>> JMSException if there has been any message delivery since the last
>>> acknowledgement was sent and failover occurred whilst the session
>>> was 'dirty'.
>>
>> Perhaps here we may have to make a distinction between: 1. messages
>> that were in the prefetch buffer but not yet delivered to the
>> application; 2. messages that were delivered to the application but
>> not yet acked (the unacked message buffer).
>>
>> AFAIK we currently mark both categories of messages as redelivered.
>> There has been some push-back on this, asking us to mark only the
>> delivered-but-unacked messages as "redelivered".
>>
>
> By delivered we meant messages handed to the application and not any
> prefetched messages, which would not be considered to make the session
> dirty.

Excellent! We are on the same page here.

> After further consideration of recover(), it seems like it is not
> necessarily possible to meet the contract for the method across
> failover, even with a cluster. More later when discussing Client Ack.
>
>>> Message consumer:
>>>
>>> No further messages sent to the client by the old broker should be
>>> received by the consumer after failover has occurred, only messages
>>> sent by the new broker should be available to consumers.
>>
>> Agreed, but pls see the above comment.
>
> In this case, prefetched messages were included in the 'sent to the
> client' group, since we can't make use of them because we won't be
> able to acknowledge them, and the broker may also have sent us those
> messages again (if they still exist after we reconnect).
>
>>
>>>
>>> Queue browser:
>>>
>>> If failover occurred while iterating through a QueueBrowser
>>> enumeration, a sub-class of NoSuchElementException should be
>>> thrown by default.
>> Agreed.
>>
>>>
>>>
>>> 3. Issues with acknowledgments
>>> ==================================================
>>> 1. CLIENT_ACKNOWLEDGE
>>>
>>> The acknowledge operation should not return until all messages have
>>> actually been acknowledged. Currently it is possible for messages to
>>> remain unacknowledged after invoking the acknowledge operation: the
>>> acknowledgement is done lazily by the acknowledgement flusher, which
>>> is not what the JMS spec requires.
>>
>> I believe we need to handle three cases during failover.
>> I have provided a detailed analysis of the above in QPID-3462.
>>
>
> For the 1st of your cases, the acknowledgement can only be in doubt
> if the acknowledge method fails; if the method returns there should
> be no doubt. As you concluded, this should always be a synchronous
> operation, which currently it isn't, since half the time it doesn't
> even send an acknowledgement at all.

We are in agreement here.

> For the second case, the acknowledge call should throw an exception
> as you indicated. Although it is possible, it isn't guaranteed that
> the next message delivered will be the same message, since it could
> have been given to another client whilst reconnecting during
> failover, could have been transient and lost on a restart, or, in the
> case of non-clustered brokers, may simply not be there after
> failover. The next message after failover would just be the first
> message provided by the new broker to the client.

Agreed. When I said the 'same message' there were a lot of assumptions
involved, like a single client, a clustered broker, etc.

But considering all cases, I agree with your description.

> In the third case we somewhat disagree, based on the example. After
> failover occurs, we can't simply give the application the previous
> message again. As mentioned above, this may not be a message that is
> available any more, and since we know we can't acknowledge messages
> received prior to failover anyway, it doesn't make sense to give it
> to the application a second time and then throw an exception upon
> acknowledgement. Perhaps the example is assuming the broker will
> always send the same message again first? Also, it isn't clear what
> behaviour this implies if there were e.g. 3+ messages received in
> the same acknowledge window.
>
Sorry, I should have clearly stated the assumptions there. This was
based on a test case I had in mind: a single client and a clustered
broker setup, so no message loss and no messages being delivered to
any other client. In that case the broker starts message delivery
again from the oldest unacked message.

E.g. for the above case, if the app got 3 messages (msg1, msg2 & msg3)
and failover happened while trying to ack on receiving the 4th, then
after failover the broker should start message delivery again from
msg1, and messages msg1-msg4 should be marked redelivered.

Is that agreeable?

In other cases (where there are multiple consumers, or message loss
due to a total broker crash, etc.) the broker will restart delivery
with the next available message.
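The single-client/clustered-broker case above can be modelled in a few lines. This is a toy model, not real client or broker code; it only captures the invariant that redelivery restarts from the oldest unacked message with the redelivered flag set:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of a clustered broker's unacked-message buffer; illustrative only.
class RedeliveryModel {

    static class Message {
        final String id;
        boolean redelivered;
        Message(String id) { this.id = id; }
    }

    private final Deque<Message> unacked = new ArrayDeque<>();

    // Broker hands a message to the client; it stays unacked until acked.
    void deliver(String id) { unacked.addLast(new Message(id)); }

    // On failover, the replica re-sends every unacked message, oldest
    // first, with the redelivered flag set.
    List<Message> failover() {
        List<Message> again = new ArrayList<>();
        for (Message m : unacked) {
            m.redelivered = true;
            again.add(m);
        }
        return again;
    }

    public static void main(String[] args) {
        RedeliveryModel broker = new RedeliveryModel();
        for (int i = 1; i <= 4; i++) broker.deliver("msg" + i);
        // msg1..msg4 reached the client, but failover hit before the ack.
        List<Message> again = broker.failover();
        if (again.size() != 4) throw new AssertionError();
        if (!again.get(0).id.equals("msg1")) throw new AssertionError();
        for (Message m : again) {
            if (!m.redelivered) throw new AssertionError();
        }
    }
}
```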

> If the session is dirty during failover in Client Ack mode, we need
> to simply drop knowledge of the previous messages given to the client
> by the broker (prefetched or delivered to the app) but retain
> knowledge of whether the session was dirty during failover. We also
> have to accept that, as not all the messages may still be available
> to the client after failover, we can't guarantee to meet the contract
> for recover() across failover either (looking at the code, it looks
> like we currently can't meet it regardless of failover). The question
> for us then is really how we convey that to the application, possibly
> depending on what it does next. We could use the exception listener
> [if there is one] at the point the failover occurs to indicate that
> failover happened; or we could wait and throw an exception on
> acknowledge() / recover() [whichever is called, since only one can be
> called on the same dirty session before it is clean again] to
> indicate we can't meet the expected behaviour due to failover; or we
> could throw an exception on receive() if we knew such calls couldn't
> deliver messages that could cleanly be acknowledged due to failing
> over with previously unacked messages (but the client may only be
> using onMessage and so not get such exceptions). Alternatively, you
> could argue that recover() should never throw an exception since it
> can never be guaranteed to work, and so that could just be documented
> behaviour without an exception. The sticking point with all of the
> above is what would be considered most appropriate/helpful from the
> client application perspective.
>
> Related to this, when using transactions similar questions arise
> around rollback(), commit(), send() and receive(). If the session is
> dirty at the point failover occurs, it is known that the first
> subsequent commit() should cause a rollback followed by a
> TransactionRolledBackException. However, it is also known that any
> send() or receive() calls before such a commit() are irrelevant, as
> that work will be rolled back, so should we throw an exception at
> this point to indicate that? Currently the 0-9 codepath does this on
> send(), but the 0-10 client doesn't, and neither does it on
> receive(). For rollback(), it can be argued it should never throw an
> exception given what it is doing, although applications calling
> rollback might expect the same messages to be received again, and
> that may not be the case due to the failover, so it could instead be
> argued that, like recover(), an exception should be raised.
>
>
>>> 2.AUTO_ACKNOWLEDGE
>>>
>>> The JMS spec requires that after each receive() call or completion
>>> of each onMessage the received message should be acknowledged.
>>> Currently this does not happen, as the acknowledgement is done
>>> lazily by the acknowledgement flusher, which does not give the same
>>> guarantee. This is in fact DUPS_OK_ACKNOWLEDGE behaviour. The
>>> flusher thread should not be running for AUTO_ACKNOWLEDGE.
>>
>> In order to be spec compliant, applications need to use
>> -Dsync_ack=true, which forces the client to sync with the broker
>> after each onMessage or receive call.
>> However this is terribly slow, hence the client behaves like DUPS_OK.
>>
>> I don't necessarily condone this behaviour (nor did I code this
>> part), but making it spec compliant might suddenly make a lot of
>> applications crawl :D So we need to be very careful here. If we do
>> it, we need to make a lot of noise about it in the release notes
>> etc.
>>
>> The ack-flusher was done to periodically flush message completions
>> during periods of relatively slow message flows.
>> If we fix AUTO_ACK then I agree this does not need to run during AUTO_ACK.
>>
>
> In the 0-8 client, acks are sent asynchronously but at least on a
> per-message basis, rather than being batched up for X period of time.
> We could perhaps investigate that behaviour for the 0-10 client, if
> not actually making fully synchronous acking the default.

I fully agree on doing this without impacting the performance of
existing applications.
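For reference, the spec-compliant synchronous acking discussed above is opt-in via the JVM system property mentioned earlier; the classpath entries and application class name here are purely illustrative:

```
java -Dsync_ack=true -cp qpid-client.jar:myapp.jar com.example.MyJmsApp
```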

>>> 3. SESSION_TRANSACTED
>>> The flusher thread is started but is not really needed, because
>>> acknowledgement for transacted sessions is handled differently.
>>> It seems that the flusher thread only makes sense to run in
>>> DUPS_OK_ACKNOWLEDGE mode.
>>
>> This is actually a bug. As you say, we only need to run it during
>> DUPS_OK and AUTO_ACK (until auto-ack gets fixed).
>>
>
> (Some commentary about transacted sessions was given above in the
> client ack section along with related comments)
>
>>>
>>> 4. NO_ACKNOWLEDGE
>>>
>>> The flusher thread should not be running for NO_ACKNOWLEDGE.
>>
>> We need to clearly define how the JMS ack modes will work alongside
>> the "reliability" option in address strings.
>
> Agreed. In requesting NO_ACK mode it seems fairly clear you didn't
> want to ack messages, so if that conflicts badly with any selected
> link options, such as at-least-once, it seems it would warrant an
> exception being raised (i.e. the JMS acknowledge mode selected in the
> code takes precedence).

Whatever we decide to do, we need to document it clearly to ensure our
users are not confused. I think it's reasonable to make the JMS ack
mode take precedence.

>>
>>>
>>> 5. PRE_ACKNOWLEDGE
>>>
>>> The same issue as with AUTO_ACKNOWLEDGE.
>>> Is anybody using this mode? Does it make sense to keep it?
>>
>> I don't know if anybody is using it. I wonder what the use case
>> behind it is.
>
> I'm not sure anyone does. It acknowledges before onMessage() rather
> than after, which in the case of the 0-10 codepath makes little
> difference by default, since auto-ack, dups-ok and pre-ack all
> currently depend on the flusher thread to do the acknowledgements.
>
> As a little aside, whilst investigating the various options for
> failover, I found the following document describing the Oracle
> WebLogic failover policy:
> http://download.oracle.com/docs/cd/E13222_01/wls/docs92/jms/recover.html
> It seems useful to consider; investigating what some other vendors do
> for JMS client recovery could be worthwhile too.
>
>
> Also, I created a number of tests to cover failover functionality,
> but I cannot commit them until the failover behaviour is finalised.
>
> Kind Regards,
> Alex
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:dev-subscr...@qpid.apache.org
>
>
