On 10/28/2011 03:39 PM, Andrew Kennedy wrote:
On 28 Oct 2011, at 15h19, Gordon Sim wrote:
On 10/28/2011 03:17 PM, Rob Godfrey wrote:
On 28 October 2011 15:12, Gordon Sim <[email protected]> wrote:

On 10/26/2011 08:38 PM, Carl Trieloff wrote:

Use queue state replication and client failover via the address or
failover exchange.


A few words of warning... as it stands there are a few operational issues
with that 'feature' you should be aware of:

* if links are down, replication events can back up very fast and there is
no way to manage this
* resuming replication is impossible if any events are lost
* it's a little cumbersome to configure

Good to know.

[I've been experimenting with a much nicer design proposed by Ted. I have a
working patch but it's unlikely to get committed anytime soon]


Is this something that would also be useful to implement in the Java
Broker...

Yes

+1, definitely.

is there a public design document for it?

No, not yet. As I say, it's an experimental patch only at this stage, primarily 
because I was intrigued by the idea. I'm only mentioning it here as it fixes 
the issues with the existing queue replication very effectively.

I'd be very interested in looking at this, if you could supply more details 
about your design, Gordon.

The core idea is to use a browsing subscription as the means to replicate the message content. This requires that browsers see all messages, including those that are acquired.

In addition, you need to send dequeue events. I do this on the same subscription using messages with routing key 'dequeue_event' with an encoded AMQP 0-10 sequence set as their body. The messages being dequeued are identified by a sequence number assigned by the queue. Allowing multiple dequeues to be combined in a single message helps prevent a build-up of messages for a slow consumer (or a consumer trying its best when the queue is heavily loaded!).
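To make the coalescing concrete, here is a minimal Python sketch (not the code from the patch). It merges dequeued sequence numbers into closed ranges and packs them the way I understand the AMQP 0-10 sequence-set encoding: a 16-bit size in octets followed by pairs of 32-bit bounds. The function names are made up for illustration, and serial-number wraparound is ignored.

```python
import struct

def coalesce(seqs):
    """Merge sequence numbers into sorted closed ranges [(lo, hi), ...]."""
    ranges = []
    for s in sorted(set(seqs)):
        if ranges and s == ranges[-1][1] + 1:
            # Extends the previous range by one.
            ranges[-1] = (ranges[-1][0], s)
        else:
            ranges.append((s, s))
    return ranges

def encode_sequence_set(ranges):
    """Assumed layout: uint16 byte count, then (lower, upper) uint32 pairs."""
    body = b"".join(struct.pack("!II", lo, hi) for lo, hi in ranges)
    return struct.pack("!H", len(body)) + body

def decode_sequence_set(data):
    """Inverse of encode_sequence_set; returns a list of (lo, hi) tuples."""
    (size,) = struct.unpack_from("!H", data, 0)
    return [struct.unpack_from("!II", data, 2 + i) for i in range(0, size, 8)]

# Six dequeues collapse into three ranges carried by one event message.
event_body = encode_sequence_set(coalesce([5, 1, 2, 3, 9, 10]))
```

A long run of consecutive dequeues costs the same eight bytes as a single one, which is why a slow replica does not cause a backlog of events.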

This approach means that when the replicating subscription is not active, there is no build-up of state for replication. When resubscribing, the subscription can include its current sequence number (i.e. the last one assigned) and the sequence number of its oldest message. This allows the source queue to generate any necessary dequeue events to bring the two back in sync and to start the browsing at the correct position.
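The resubscribe step could look roughly like the following Python sketch. It assumes the source can enumerate the sequence numbers still on its queue; `resync` and its parameter names are hypothetical, and wraparound and the empty-replica case are left out.

```python
def resync(source_seqs, replica_oldest, replica_last):
    """Work out what a resubscribing replica needs.

    source_seqs    -- sequence numbers currently on the source queue
    replica_oldest -- sequence number of the replica's oldest message
    replica_last   -- last sequence number the replica was assigned

    Returns (dequeued, to_browse): sequence numbers to report as
    dequeued, and the messages the browse should then deliver.
    """
    present = set(source_seqs)
    # Anything in the replica's window that the source no longer holds
    # was dequeued while the subscription was inactive.
    dequeued = [s for s in range(replica_oldest, replica_last + 1)
                if s not in present]
    # Browsing restarts just after the replica's last known message.
    to_browse = sorted(s for s in present if s > replica_last)
    return dequeued, to_browse

dequeued, to_browse = resync({4, 6, 7, 8}, replica_oldest=2, replica_last=6)
```

The dequeued list may name messages the replica had already removed before it disconnected. That is harmless as long as the replica treats a dequeue for an unknown sequence number as a no-op.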

(Contrast with the current solution, where there is a steadily growing queue of events that are not coalesced in any way and whose deletion tends to render it impossible to resume replication.)

The completion of published messages is delayed until either any replicating subscriptions have themselves confirmed receipt or the message is dequeued. Likewise, completions of accepts covering messages delivered from that queue should be delayed until any replicating subscriptions have confirmed receipt of the dequeue notifications (my prototype doesn't yet do this; accepts complete immediately).
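As a rough illustration of the publish side of that rule, here is a small Python sketch. The class and method names are invented for the example and say nothing about how the broker code is structured. It does not cover the accept side.

```python
class CompletionTracker:
    """Delay completion of a publish until every replicating
    subscription has confirmed it, or until the message is dequeued."""

    def __init__(self, replicas):
        self.replicas = set(replicas)
        self.pending = {}    # seq -> replicas that have not yet confirmed
        self.completed = []  # seqs whose publish has been completed, in order

    def published(self, seq):
        waiting = set(self.replicas)
        if waiting:
            self.pending[seq] = waiting
        else:
            # No replicating subscriptions: complete immediately.
            self.completed.append(seq)

    def confirmed(self, replica, seq):
        waiting = self.pending.get(seq)
        if waiting is None:
            return  # already completed, or unknown
        waiting.discard(replica)
        if not waiting:
            self._complete(seq)

    def dequeued(self, seq):
        # A dequeued message no longer needs replicating.
        if seq in self.pending:
            self._complete(seq)

    def _complete(self, seq):
        del self.pending[seq]
        self.completed.append(seq)
```

Delivery to ordinary consumers never consults `pending`, so only the publisher's completion waits on replication.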

This gives synchronous replication as far as the client is concerned, but doesn't delay delivery of messages until they are replicated, so it has little impact on latency.

My prototype uses a quick-and-dirty hack to configure this. A special pattern in the exchange name (starts with 'qpid.replicator-') on a federation route identifies the route as being created for queue replication (which causes the necessary arguments to be set on the subscribe). Though the implementation (and tooling) would be done differently in a real solution, I believe it would be as simple to set up (which is a fair bit simpler than at present). A nice bonus is that you don't have to signal the desire to replicate when first creating the source queue.
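In Python terms the hack amounts to something like the sketch below. Only the 'qpid.replicator-' prefix comes from the description above; the argument keys are placeholders, not the names the patch or the broker uses.

```python
REPLICATOR_PREFIX = "qpid.replicator-"  # pattern described above

def subscribe_arguments(exchange_name):
    """Return extra subscribe arguments for a federation route.

    The keys are hypothetical stand-ins for whatever the real
    subscription arguments would be called.
    """
    if not exchange_name.startswith(REPLICATOR_PREFIX):
        return {}  # ordinary federation route, nothing special
    return {
        "hypothetical.replicating-subscription": True,
        # Browsers must see acquired messages too.
        "hypothetical.include-acquired": True,
    }
```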

Happy to send the patch if you want to play around with it.

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:[email protected]
