Hi Vincent,

What's odd is that if I fail over all virtual IPs to one server and set net.ipv4.ip_nonlocal_bind=0 on that server, the issue goes away. The issue remains "fixed" when I fail half of the virtual IPs back to the secondary server and set net.ipv4.ip_nonlocal_bind=1 again. However, after a reboot of both servers the initial behavior comes back. This seems to be related to the way the 2.6.32 kernel handles net.ipv4.ip_nonlocal_bind and how that interacts with the sockets' file descriptors.
The logs don't show anything suspicious. When a reload is successful I see the expected output in the logs:

Oct 30 09:49:53 127.0.0.1 haproxy[26191]: Proxy haproxy-stats started.
Oct 30 09:50:22 127.0.0.1 haproxy[26192]: Pausing proxy haproxy-stats.
Oct 30 09:50:22 127.0.0.1 haproxy[26215]: Proxy haproxy-stats started.
Oct 30 09:50:22 127.0.0.1 haproxy[26192]: Stopping proxy haproxy-stats in 0 ms.
Oct 30 09:50:22 127.0.0.1 haproxy[26192]: Proxy haproxy-stats stopped (FE: 0 conns, BE: 0 conns).

When a reload is unsuccessful, the code that pauses the old proxy, starts a new proxy, and stops the original proxy isn't called, so there is no output in the logs. Instead the Alert (cannot bind socket) is sent to stderr and is logged by consul-template.

I'm going to compile the 3.10 kernel from CentOS 7 for CentOS 6, see if the behavior persists, and report back.

Thanks,
Chris

On Fri, Oct 30, 2015 at 3:04 AM, Vincent Bernat <[email protected]> wrote:

> ❦ 30 October 2015 00:34 -0400, Chris Riley <[email protected]> :
>
> > The kernel version is 2.6.32-358.23.2.el6.x86_64, the OS is CentOS
> > 6.4.
>
> With this version of the kernel, the previous instance of HAProxy has to
> release the port before the new one can bind. It seems that in your
> case, this doesn't happen. Nothing suspicious in the logs of the
> previous instance?
> --
> Let us endeavor so to live that when we come to die even the undertaker
> will be sorry.
>                 -- Mark Twain, "Pudd'nhead Wilson's Calendar"
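
For anyone following the thread, the constraint Vincent describes (the old listener must release the port before a new one can bind) can be reproduced outside HAProxy with two plain TCP sockets. This is only a rough sketch, not HAProxy's reload logic: the helper names `try_bind` and `demo` are made up for illustration, it uses the loopback address rather than a virtual IP, and it assumes neither socket sets SO_REUSEADDR or SO_REUSEPORT.

```python
import errno
import socket


def try_bind(host, port):
    """Try to bind and listen; return (socket, None) or (None, errno)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
        s.listen(1)
        return s, None
    except OSError as exc:
        s.close()
        return None, exc.errno


def demo():
    """Hypothetical stand-in for 'old process' and 'new process' listeners."""
    # "Old instance": bind to an ephemeral port chosen by the kernel.
    old, _ = try_bind("127.0.0.1", 0)
    port = old.getsockname()[1]

    # "New instance" tries to bind while the old listener still holds the port.
    new, err_while_held = try_bind("127.0.0.1", port)
    if new is not None:
        new.close()

    # Old instance releases the port; the new bind is attempted again.
    old.close()
    new_after, err_after_release = try_bind("127.0.0.1", port)
    if new_after is not None:
        new_after.close()

    return err_while_held, err_after_release


if __name__ == "__main__":
    held, released = demo()
    print("while old listener is open:", errno.errorcode.get(held, held))
    print("after old listener closed: ", "ok" if released is None else released)
```

On Linux the first attempt is expected to fail with EADDRINUSE and the second to succeed, which matches the "cannot bind socket" alert appearing whenever the previous instance has not let go of the port. It does not explain why ip_nonlocal_bind changes the outcome.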

