2008/8/26 Martin Ritchie <[EMAIL PROTECTED]>:
> Hi,
>
> Just raised a bug as a result of a CI failure for the 
> SyncWaitTimeoutDelayTest.
>
> It appears to me to be a protocol bug. Is anyone fluent in 0-10 able to
> say whether the bug is also in 0-10?
>
> Is there going to be a 0-9 update that might address this?
>
> https://issues.apache.org/jira/browse/QPID-1262
>
> The problem in a nutshell:
>
> TxCommitOk is not correlated with the TxCommit that initiated the work
> on the broker.
> So if our broker takes a long time (using SlowMessageStore) to perform
> the commit and the client times out waiting for the TxCommitOk (as in
> the SyncWaitTimeoutDelayTest), then when a subsequent TxCommit is sent,
> the late TxCommitOk for the first commit can signal the wait for the
> second one by mistake.
>
> AMQP Method Sequence:
> [C]lient
> [B]roker
> [S]end
> [R]eceive
>
> CS: TxCommit  (a)
> BR: TxCommit  (a)
> // Broker takes a lot of time
> // Client times out waiting for TxCommitOk (a)
> CS: TxCommit  (b)
> BS: TxCommitOk (a)
> CR: TxCommitOk  (a)
> // At this point the client thinks that its commit (b) has
> succeeded; it hasn't.
>
> My only thoughts were
> a) add correlation ids to the TxCommit TxCommitOk pairs, as was done
> above for clarity in the explanation.
> b) close the session in the event of a timeout and re-establish session.
>

Option b) is the only safe alternative for 0-8/0-9.
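
As a rough illustration of option b), here is a minimal Java sketch. The
TxSession and SessionFactory interfaces and the commitOrReplace helper are
hypothetical stand-ins made up for this example; they are not the Qpid
client or JMS API. The point is only that a session whose commit wait timed
out is thrown away, so a late TxCommitOk has nothing left to signal.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CloseOnTimeoutSketch {

    /** Hypothetical stand-in for a transacted session; not the Qpid or JMS API. */
    interface TxSession {
        CompletableFuture<Void> commitAsync(); // completes when TxCommitOk arrives
        void close();                          // drops the session and any late replies
    }

    /** Hypothetical factory used to re-establish a session after a timeout. */
    interface SessionFactory {
        TxSession open();
    }

    /**
     * Commits on the given session. If the wait times out, the session is
     * closed and a fresh one is returned, so a late TxCommitOk can never be
     * matched against a later TxCommit. Otherwise the same session is returned.
     */
    static TxSession commitOrReplace(TxSession session, SessionFactory factory,
                                     long timeoutMillis) {
        try {
            session.commitAsync().get(timeoutMillis, TimeUnit.MILLISECONDS);
            return session;
        } catch (TimeoutException e) {
            session.close();
            return factory.open();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            throw new IllegalStateException("interrupted while waiting for commit", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("commit failed", e.getCause());
        }
    }
}
```

Note that after a timeout the outcome of the original commit is still
unknown to the client, so the caller has to treat that transaction as in
doubt.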

Completion of commands is correlated in 0-10 so this is no longer an issue...
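
To make the difference concrete, here is a toy Java model of the sequence
Martin describes. It is only a sketch: the class and method names are
invented for this example and the integer ids are an assumption standing in
for whatever correlation the protocol provides; none of it is taken from
the Qpid code base or from the wire format of either protocol version.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the race; all names here are hypothetical. */
public class CommitCorrelationSketch {

    /** Uncorrelated replies: a TxCommitOk completes whatever wait is current. */
    static final class UncorrelatedClient {
        private String waitingFor; // label of the commit currently being waited on

        void sendCommit(String label) { waitingFor = label; }

        void timeout() { waitingFor = null; }

        /** Returns the commit the client believes was acknowledged, or null if none. */
        String onCommitOk() {
            String completed = waitingFor;
            waitingFor = null;
            return completed;
        }
    }

    /** Correlated replies: each completion names the command it completes. */
    static final class CorrelatedClient {
        private final Map<Integer, String> pending = new HashMap<>();
        private int nextId = 0;

        int sendCommit(String label) {
            int id = nextId++;
            pending.put(id, label);
            return id;
        }

        void timeout(int id) { pending.remove(id); }

        /** Returns the commit matched by this id, or null for a stale reply. */
        String onCommitOk(int id) { return pending.remove(id); }
    }

    public static void main(String[] args) {
        UncorrelatedClient u = new UncorrelatedClient();
        u.sendCommit("a");
        u.timeout();       // client gives up on (a)
        u.sendCommit("b"); // broker is still working on (a)
        System.out.println("uncorrelated: Ok for (a) completes " + u.onCommitOk());

        CorrelatedClient c = new CorrelatedClient();
        int a = c.sendCommit("a");
        c.timeout(a);
        int b = c.sendCommit("b");
        System.out.println("correlated: Ok for (a) completes " + c.onCommitOk(a));
        System.out.println("correlated: Ok for (b) completes " + c.onCommitOk(b));
    }
}
```

In the uncorrelated case the late reply for (a) is taken as the
acknowledgement of (b); with ids the stale reply matches nothing and is
ignored, and only the reply carrying the id of (b) completes it.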

-- Rob

> thoughts?
> --
> Martin Ritchie
>
