Hi Cyril, Sebastian,

On Sun, Feb 12, 2012 at 10:49:26PM +0100, Cyril Bonté wrote:
> On 12/02/2012 22:31, Sebastian Fohler wrote:
> >I've noticed that too.
> >The problem is, when I try to reach the backend servers themselves (they
> >are all reachable by their own name, adserve1/adserver2/...). They don't
> >show any problems at all, the question in that case is, how do I find
> >out which error they throw, seen from the lb end.
> >Sure I have checked that, and a ping from the lb server to the backend
> >system is without any trouble even when the haproxy frontend tells me,
> >they are not.
>
> Ping doesn't try to open a new tcp socket, which is the issue you have.
> From the lb, try to send a http request on the faulty backend (with
> curl for example) when it is detected down, probably you'll see the same
> issue, or it could be slow.
>
> Maybe your ephemeral port range is too short.
> Can you provide some sysctl values such as net.inet.ip.portrange.* ?

I suspect that it's worse. I've read the whole thread, and to me it looks
like some firewall is blocking outgoing connections because its state table
is full. And if the problem only happens when haproxy is running, the
firewall is likely on the same machine. That's a common issue when running
firewalls on proxies, because the number of connections is more than doubled
since the proxy is two-sided. And generally it happens exactly as described
in the thread: CPU rises a bit, then nothing passes (including checks), while
the servers appear to be up when tested manually.

Sebastian, do you have a way to check that on the haproxy machine? We'd need
stats on the number of sessions, and kernel messages to see if anything
reports a full state table.

Regards,
Willy
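
For reference, the TCP-level test suggested in the quoted text (as opposed to
ping) could look roughly like the sketch below. It is only an illustration:
the function name tcp_probe and the default timeout are made up for this
example and are not part of haproxy or of anything posted in the thread. It
opens a new TCP connection the way a health check would, so it needs a free
local port and, if a stateful firewall runs on the same machine, a free state
entry.

```python
import socket
import time


def tcp_probe(host, port, timeout=3.0):
    """Try to open a new TCP connection to host:port.

    Returns (ok, seconds, error_text). tcp_probe is a hypothetical helper
    written for this example, not an haproxy feature.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the name and performs a full TCP
        # handshake; the socket is closed again when the block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        # OSError covers refused connections, timeouts, name resolution
        # failures, and local errors such as EADDRNOTAVAIL.
        return False, time.monotonic() - start, f"{type(exc).__name__}: {exc}"
```

Run from the lb while haproxy reports a server down, for example
tcp_probe("adserver1", 80) (the name and port are assumptions, adjust them to
the real backend), a failure or a long connect time here while ping still
works would point at the local machine rather than at the backend.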

