From: Willy Tarreau <[email protected]>
Sent: 2014-01-25 05:45:11 E
To: Patrick Hemmer <[email protected]>
CC: Malcolm Turnbull <[email protected]>, [email protected] <[email protected]>
Subject: Re: Just a simple thought on health checks after a soft reload of HAProxy....

> On Tue, Jan 21, 2014 at 09:04:12PM -0500, Patrick Hemmer wrote:
>> Personally I would not like that every server is considered down until
>> after the health checks pass. Basically this would result in things
>> being down after a reload, which defeats the point of the reload being
>> non-interruptive.
> I can confirm, we had this in a very early version, something like 1.0.x
> and it was quickly changed! I've been using Alteon load balancers for
> years and their health checks are slow. I remember that the persons in
> charge for them were always scared to reboot them because the services
> remained down for a long time after a reboot (seconds to minutes). So
> we definitely don't want this to happen here.
>
>> I can think of 2 possible solutions:
>> 1) When the new process comes up, do an initial check on all servers
>> (just one) which have checks enabled. Use that one check as the verdict
>> for whether each server should be marked 'up' or 'down'.
> Till now that's exactly what's currently done. The servers are marked
> "almost dead", so the first check gives the verdict. Initially we had
> all checks started immediately. But it caused a lot of issues at several
> places where there were a high number of backends or servers mapped to
> the same hardware, because the rush of connection really caused the
> servers to be flagged as down. So we started to spread the checks over
> the longest check period in a farm.

Is there a way to enable the earlier behavior, with all checks started
immediately? In my environment/configuration, it causes absolutely no
issue if all the checks are fired off at the same time. As it is right
now, when haproxy starts up, it takes quite a while to discover which
servers are down.

-Patrick
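
For illustration, the two startup strategies discussed above (all checks fired at once versus checks spread over the longest check period in a farm) can be sketched in Python. This is not HAProxy's code: the function names, the even-spacing formula, and the interval values are assumptions made up for the example.

```python
def immediate_offsets(intervals_ms):
    """Early behavior described above: every first check fires at startup."""
    return [0] * len(intervals_ms)


def spread_offsets(intervals_ms):
    """Spread first checks evenly over the longest check interval in the farm.

    intervals_ms holds one check interval per server, in milliseconds.
    The even spacing is an assumption for illustration only.
    """
    if not intervals_ms:
        return []
    window = max(intervals_ms)   # longest check period in the farm
    count = len(intervals_ms)
    # Server at position pos gets its first check at pos/count of the window.
    return [window * pos // count for pos in range(count)]


if __name__ == "__main__":
    farm = [2000, 2000, 5000, 10000]   # hypothetical check intervals
    print(immediate_offsets(farm))     # [0, 0, 0, 0]
    print(spread_offsets(farm))        # [0, 2500, 5000, 7500]
```

With spreading, the last server in this hypothetical farm waits 7.5 seconds for its first verdict, which is the startup delay the question above is about; with immediate checks every verdict arrives right away, at the cost of a burst of connections.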

