On 06/14/2013 03:58 PM, Kerry Bonin wrote:
On existing broker failover - can you point me to where that behavior is documented? Because neither I nor anyone on the four teams I work with has come across the functionality you describe. I've never seen a client fail over to another broker, only code to attempt to reconnect.
It appears the reconnect_urls connection option is not in fact documented. Sorry about that. It takes a single URL or a Variant::List of URLs to try when reconnecting.
Basic features we need:
- externally adjustable retry / timeout on connections - to handle differences between LAN, WAN, and satellite internet.
- updating broker list: How do you do this? Never seen it...
There are two options. The first is that any URL in the AMQP 0-10 format can itself contain multiple hosts, e.g. amqp:tcp:host1:port1,host2:port2. The second is to use the reconnect_urls option as above.
(When used in conjunction with the failover exchange there is a helper class that will receive updates and apply them: http://qpid.apache.org/books/0.20/Programming-In-Apache-Qpid/html/ch02s14.html. Something similar could be done for some other distribution mechanism.)
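To make the first option concrete, here is a minimal sketch of building the multi-host form from a broker list that an application maintains itself. The helper name joinBrokerUrl and the "host:port" input format are assumptions for illustration; the helper is not part of the Qpid API and only reproduces the amqp:tcp:host1:port1,host2:port2 shape quoted above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Qpid: joins "host:port" entries into the
// multi-host URL form quoted above (amqp:tcp:host1:port1,host2:port2).
// Returns an empty string when there are no brokers to list.
std::string joinBrokerUrl(const std::vector<std::string>& hostPorts) {
    if (hostPorts.empty()) {
        return std::string();
    }
    std::string url = "amqp:tcp:";
    for (std::size_t i = 0; i < hostPorts.size(); ++i) {
        if (i > 0) {
            url += ',';  // hosts are separated by commas in the quoted form
        }
        url += hostPorts[i];
    }
    return url;
}
```

The resulting string would be passed wherever the client accepts a broker URL. For the second option, the same list would instead be supplied through the reconnect_urls connection option.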
- to prevent network splits, how are recovered brokers monitored? When a failed broker recovers, do clients switch back? How often / aggressively checked?
No, there is no switch back behaviour in the client. The new HA code allows a broker to be classed as in a backup or primary role and backups will reject or kick off any clients, causing them to fail over. Whatever cluster management solution was in use would then detect changes to primary and use QMF to tell each broker what their role was.
- how is the application notified on broker failure, connection failover, recovery?
It isn't. Any threads using the connection will essentially block until either the connection is re-established or the configured limit is reached and the client gives up trying.
Now that I write this, I do recall a conversation on this topic with you some time back, with this being an issue for you.
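The blocking behaviour described above can be pictured roughly as follows. This is only an illustration, not Qpid's actual implementation: reconnectBlocking, its parameters and the tryConnect callback are all hypothetical names, and the fixed retry interval is a simplifying assumption. In the real client I believe the limit would come from connection options such as reconnect_limit or reconnect_timeout, but treat that mapping as an assumption too.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Illustrative only, not Qpid code. The caller blocks while each URL is tried
// in turn; control returns only on success or after `limit` failed attempts.
// `tryConnect` is a hypothetical stand-in for a real connection attempt.
bool reconnectBlocking(const std::vector<std::string>& urls,
                       int limit,
                       std::chrono::milliseconds interval,
                       const std::function<bool(const std::string&)>& tryConnect,
                       std::string& connectedUrl) {
    if (urls.empty()) {
        return false;
    }
    for (int attempt = 0; attempt < limit; ++attempt) {
        // Cycle through the broker list round-robin.
        const std::string& url = urls[static_cast<std::size_t>(attempt) % urls.size()];
        if (tryConnect(url)) {
            connectedUrl = url;
            return true;
        }
        if (attempt + 1 < limit) {
            std::this_thread::sleep_for(interval);  // wait before the next try
        }
    }
    return false;  // limit reached: give up, as the client does
}
```

Note that nothing in this loop tells the application a failover is in progress. The calling thread simply does not return until the loop ends, which is the lack of notification raised in the question.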
Finally, we were ending up with LOTS of application complexity in SOA code when broker failure / recovery meant connection, sender and receiver objects had to be recreated. This was compounded by Connection being a different type of Boost object than senders and receivers.
That's strange. Those classes all use the Handle template so I can't see how they would be different in that regard. I don't suppose you recall the details?
And anything you can think of for dynamically load balancing across brokers?
Honestly, I think the simplest solution overall is for us to get federation working on Windows. I assume it's some issue in the IO layer. Does anyone have a concrete understanding of what the problem is and what is required to fix it?
Any volunteers from our Windows experts to take a look (Cliff, Chuck, Andrew, Steve)?
Greatly appreciate the feedback and input...
Likewise!
