Re: Qpid post-mortem and request for suggestions for (my) next release challenge (10M msgs/sec on Windows)

Gordon Sim Mon, 17 Jun 2013 05:27:15 -0700

On 06/14/2013 03:58 PM, Kerry Bonin wrote:

On existing broker failover - can you point me to where that behavior is
documented?  Because neither I nor anyone on the four teams I work with
has come across the functionality you describe.  I've never seen a client
failover to another broker, only code to attempt to reconnect.

It appears the reconnect_urls connection option is not in fact documented. Sorry about that. It takes a single url or a Variant::List of urls to try when reconnecting.

Basic features we need:
- externally adjustable retry / timeout on connections - to handle
differences between LAN, WAN, and satellite internet.
- updating broker list: How do you do this?  Never seen it...

There are two options. The first is that any url in the AMQP 0-10 format can itself contain multiple hosts, e.g. amqp:tcp:host1:port1,host2:port2. The second is to use the reconnect_urls option as above.

(When used in conjunction with the failover exchange there is a helper class that will receive updates and apply them: http://qpid.apache.org/books/0.20/Programming-In-Apache-Qpid/html/ch02s14.html, something similar could be done for some other distribution mechanism).
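
To make the multi-host url form concrete, here is a minimal sketch of how such a url expands into one candidate address per broker. It is purely illustrative: splitBrokerUrl is a hypothetical helper, not part of the Qpid API and not the client's actual parser, and it assumes only the comma-separated form shown above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the Qpid API): expand a multi-host url such
// as "amqp:tcp:host1:port1,host2:port2" into one entry per broker address.
std::vector<std::string> splitBrokerUrl(const std::string& url) {
    const std::string scheme = "amqp:";
    // Strip the leading "amqp:" if it is present.
    std::string rest = url.compare(0, scheme.size(), scheme) == 0
                           ? url.substr(scheme.size())
                           : url;
    std::vector<std::string> addresses;
    std::string::size_type start = 0;
    while (start <= rest.size()) {
        std::string::size_type comma = rest.find(',', start);
        if (comma == std::string::npos) comma = rest.size();
        // Skip empty segments (e.g. a trailing comma).
        if (comma > start) addresses.push_back(rest.substr(start, comma - start));
        start = comma + 1;
    }
    return addresses;
}
```

Calling splitBrokerUrl("amqp:tcp:host1:port1,host2:port2") yields the two entries "tcp:host1:port1" and "host2:port2"; each entry is left exactly as written in the url.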

- to prevent network splits, how are recovered brokers monitored?  When a
failed broker recovers, do clients switch back?  How often / aggressively
checked?

No, there is no switch back behaviour in the client. The new HA code allows a broker to be classed as in a backup or primary role and backups will reject or kick off any clients causing them to failover. Whatever cluster management solution was in use would then detect changes to primary and use QMF to tell each broker what their role was.

- how is the application notified on broker failure, connection failover,
recovery?

It isn't. Any threads using the connection will essentially block until either the connection is re-established or the configured limit is reached and the client gives up trying.

Now that I write this, I do recall a conversation on this topic with you some time back, with this being an issue for you.
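
The blocking behaviour described above can be modelled roughly as a bounded retry loop over the candidate urls. This is only a sketch under stated assumptions, not the client's implementation: reconnect and tryConnect are hypothetical names, the attempt limit stands in for whatever limit is configured, and the sleep between attempts is omitted.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model (not Qpid's implementation) of a bounded reconnect loop.
// tryConnect stands in for a real connection attempt; limit is the maximum
// number of attempts. Returns the url that connected, or an empty string if
// the limit was reached without success.
std::string reconnect(const std::vector<std::string>& urls,
                      std::size_t limit,
                      const std::function<bool(const std::string&)>& tryConnect) {
    if (urls.empty()) return "";
    for (std::size_t attempt = 0; attempt < limit; ++attempt) {
        // Cycle through the candidate brokers in order.
        const std::string& url = urls[attempt % urls.size()];
        if (tryConnect(url)) return url;
        // A real client would wait here before the next attempt.
    }
    return "";  // Gave up: the caller is unblocked with a failure.
}
```

The calling thread stays inside reconnect for the whole loop, which is why the application sees nothing until the call either succeeds or gives up.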

Finally, we were ending up with LOTS of application complexity in SOA code
when broker failure / recovery meant connection, sender and receiver
objects had to be recreated.  This was compounded by Connection being a
different type of boost object than senders and receivers.

That's strange. Those classes all use the Handle template so I can't see how they would be different in that regard. I don't suppose you recall the details?

And anything you can think of for dynamically load balancing across brokers?

Honestly, I think the simplest solution overall is for us to get federation working on Windows. I assume it's some issue in the IO layer. Does anyone have a concrete understanding of what the problem is and what is required to fix it?

Any volunteers from our Windows experts to take a look (Cliff, Chuck, Andrew, Steve)?

Greatly appreciate the feedback and input...


Likewise!

