Re: Suggestions for Broker fault tolerance on Windows platform?

Alan Conway Thu, 06 May 2010 07:16:01 -0700

On 05/05/2010 03:22 PM, Kerry Bonin wrote:

I like the idea of generalizing the failover scheme.  Any suggestions
on where to start?  An externally configurable list of brokers
published to amq.failover would seem an appropriate first step, and
then work on mechanisms to update those lists.  I'm looking over
failover examples now...

Look at cluster/FailoverExchange.h/cpp. The cluster plugin creates &
registers an instance of this exchange. I think what you describe above
is as simple as:

- Implement FailoverExchange::route() (currently a no-op) to parse the
  host list in the message and call updateUrls().

- Write a plugin that creates and registers the failover exchange.

- Publish updates to amq.failover and the existing failover mechanism
  will do the rest.
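
To make the first step concrete, here is a minimal, standalone sketch of the parsing half only. It assumes the message body carries the broker list as a comma-separated string such as "host1:5672,host2:5672"; that format, and the names parseBrokerList and trim, are hypothetical and not taken from the Qpid source. The result would then be converted to whatever type updateUrls() expects.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: strip leading/trailing whitespace from one list entry.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return std::string();
    std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Hypothetical helper: split a comma-separated broker list into entries.
// ASSUMPTION: the comma-separated body format is illustrative only; it is
// not a documented wire format for amq.failover.
std::vector<std::string> parseBrokerList(const std::string& body) {
    std::vector<std::string> urls;
    std::string::size_type start = 0;
    while (start <= body.size()) {
        std::string::size_type end = body.find(',', start);
        if (end == std::string::npos) end = body.size();
        std::string item = trim(body.substr(start, end - start));
        if (!item.empty()) urls.push_back(item);  // skip empty entries
        start = end + 1;
    }
    return urls;
}
```

Empty entries (for example from a trailing comma) are dropped rather than treated as errors, so a slightly malformed update does not wipe out the usable part of the list.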

This raises a config issue: it would make the existing cluster plugin and
the kerrys-cluster plugin mutually incompatible, so we can't put them both
in a default plugin directory that is auto-loaded. That's a separate
concern; I've raised QPID-2571 and QPID-2572 for that.


On Wed, May 5, 2010 at 1:39 PM, Alan Conway <[email protected]> wrote:

On 05/05/2010 02:04 PM, Kerry Bonin wrote:


I've been walking down a rat hole and thought I'd take a moment to ask
the list how I should proceed - I could use some advice from someone
with deeper QPID experience...

My requirement is relatively simple - I've got a number of clients
using QPID, all currently Windows platforms, with clients in C++,
Python, and Java.  We need the system to handle broker process / host
failure, preferably in a manner transparent to the client applications
(although we can insert code between the client applications and the
QPID client library.)

We can run multiple brokers, and had planned on deploying a broker on
each server box, and wanted at least some sort of failover solution.
Our problem - how to proceed on Windows?

QPID Clustering is built over Corosync, which I believe limits it to a
subset of *nix platforms.

Federation with failover would work, but Federation doesn't work on
Windows.  (I'm still looking at QPID-2199...)

Failover code in progress appears tied to clustering, which is out for
Windows.

Should I just roll my own standalone failover, forking if necessary to
expose lower features?  (I don't want to do this!)  Can I leverage the
Failover code in the trunk and manage amq.failover without the
Corosync dependent code?  Are there any simpler approaches I'm
missing?

Comments greatly appreciated...


The client side of failover doesn't involve corosync at all. Clients
subscribe to the amq.failover exchange to get updates on the list of brokers
they can use for failover, and if they detect a connection failure they
iterate over this list till they get a connection. All this is standard
AMQP.
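
As a rough illustration of the "iterate over this list till they get a connection" step, the loop might look like the sketch below. It is not the Qpid client code: connectToFirstAvailable is a hypothetical name, and tryConnect is a stand-in for whatever connect call the client library actually provides.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Try each known broker in order. Returns the index of the first broker
// that accepts a connection, or -1 if none does.
// ASSUMPTION: `tryConnect` is a hypothetical callback wrapping the real
// client connect call; it returns true on success.
int connectToFirstAvailable(
        const std::vector<std::string>& brokers,
        const std::function<bool(const std::string&)>& tryConnect) {
    for (std::size_t i = 0; i < brokers.size(); ++i) {
        if (tryConnect(brokers[i])) return static_cast<int>(i);
    }
    return -1;  // every broker in the list refused or was unreachable
}
```

A caller would run this when it detects a connection failure, passing the most recent list it received from amq.failover.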

The cluster plugin on the broker generates those updates based on corosync
membership changes. If you had an alternate plugin generating membership
updates then it could drive the existing client failover with no changes on
the client. Of course this assumes some non-corosync source for the
membership info but you'd need that anyway in a roll-your-own scheme.
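
One possible shape for such an alternate membership source is sketched below, purely as an illustration. MembershipTracker and its methods are hypothetical names; the publish callback stands in for "send the new list to amq.failover", and how brokerUp/brokerDown get called (heartbeats, a config file, an admin tool) is left open.

```cpp
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch: tracks the set of live brokers and invokes a
// callback with the full, sorted list whenever membership actually changes.
class MembershipTracker {
public:
    typedef std::function<void(const std::vector<std::string>&)> Publish;

    // ASSUMPTION: `publish` stands in for publishing to amq.failover.
    explicit MembershipTracker(Publish publish) : publish_(publish) {}

    // Record a broker as available; publishes only if it was not known.
    void brokerUp(const std::string& url) {
        if (members_.insert(url).second) notify();
    }

    // Record a broker as gone; publishes only if it was a member.
    void brokerDown(const std::string& url) {
        if (members_.erase(url) > 0) notify();
    }

private:
    void notify() {
        publish_(std::vector<std::string>(members_.begin(), members_.end()));
    }

    std::set<std::string> members_;
    Publish publish_;
};
```

Publishing only on real changes keeps duplicate "up" or stray "down" reports from generating redundant updates to clients.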

I'd be happy to help out if you're interested in implementing something like
this. Generalizing the failover-list notification scheme is preferable to
forking and I think will be useful in other contexts as well.


---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:[email protected]
