On Tue, Apr 13, 2010 at 06:47:04PM +0200, Emmanuel Bailleul wrote:
> > -----Original Message-----
> > From: Static Void [mailto:[email protected]]
> > Sent: Tuesday, April 13, 2010 16:25
> > To: [email protected]; [email protected]
> > Subject: Question on healtchecks
> >
> > I have an active-passive HAProxy setup using keepalived, similar to
> > this:
> > http://www.howtoforge.com/setting-up-a-high-availability-load-balancer-with-haproxy-keepalived-on-debian-lenny
> >
> > My question is, is there any way to have the healthchecks performed on
> > only the active HAProxy? Currently both the active and passive HAProxys
> > ping my servers every 3 seconds at (almost) the same time. I have
> > enabled the spread-check option with a value of 5 but it seems to make
> > little difference.
> >
> > Is there anything I can do to remedy this problem? Thanks!
>
> Hi,
>
> With keepalived you have the option to run a script on certain events
> (transition from master to backup, from backup to master, ...). So why not
> just fire up haproxy on the backup machine when it becomes master?
Well, anyway in my opinion, the original question is wrong. It is important for the second LB to be aware of the servers' health because you want it to be immediately operational in case of an LB failure. You would not want it to use wrong servers or take some time to discover which ones are OK and which ones aren't.

Also, having an LB start only upon failover is really not practical at all, as it increases the failover time, and it can make it harder to debug issues. Imagine if your LBs start flapping and the service is constantly started and stopped. It completely voids the initial point of putting them in high availability.

Last, you're saying that both of your LBs send a check every 3 seconds, which means that your servers receive on average a check every 1.5 seconds. If this load is even minimally perceivable on the servers, then you don't need a load balancer, you need to rewrite the application, because you'll never have any visitor satisfied with the response time!

I'm aware of some people doing checks every 20 ms (50 checks per second per server and per LB) in order to speed up error detection in critical environments. The servers then receive 100 checks per second and barely notice them. I don't suggest going that low, it's just to illustrate that this cannot be a problem.

In fact the only problem related to the check interval generally is the timeout, because some in-depth tests may involve many components which sometimes require a bit more time to complete a test. It's trickier to adjust (using "timeout check") but still doable. But clearly 2 checks every 3 seconds have no reason to be any sort of a "problem".

Regards,
Willy
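
For illustration, the check-related settings discussed in this thread might be written roughly as follows. This is only a sketch under assumptions: the backend name, server names, addresses and all timeout values are made up for the example, and only the 3-second interval and the spread value of 5 come from the messages above. Note that the global keyword is spelled "spread-checks".

```
global
    # Add up to +/- 5% of random jitter to each check interval
    # (value taken from the original question; accepted range is 0 to 50).
    spread-checks 5

defaults
    mode http
    # Illustrative timeouts, not taken from the thread.
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    # Extra time allowed for a check to complete once the connection
    # is established; useful for in-depth checks that touch many components.
    timeout check   2s

backend app
    # Hypothetical servers, each checked every 3 seconds by every LB
    # that loads this configuration.
    server web1 192.0.2.11:80 check inter 3s fall 3 rise 2
    server web2 192.0.2.12:80 check inter 3s fall 3 rise 2
```

With the same file on both the active and the passive LB, each server sees two checks per 3-second period, which is the situation described in the question.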

