Hi Cyril, Sebastian,

On Sun, Feb 12, 2012 at 10:49:26PM +0100, Cyril Bonté wrote:
> Le 12/02/2012 22:31, Sebastian Fohler a écrit :
> >I've noticed that too.
> >The problem is, when I try to reach the backend servers themselves (they
> >are all reachable by their own name, adserve1/adserver2/...), they don't
> >show any problems at all. The question in that case is, how do I find
> >out which error they throw, seen from the lb end.
> >Sure I have checked that, and a ping from the lb server to the backend
> >system is without any trouble even when the haproxy frontend tells me,
> >they are not.
> 
> Ping doesn't try to open a new TCP socket, which is the issue you have.
> From the lb, try to send an HTTP request to the faulty backend (with
> curl for example) when it is detected down; you'll probably see the same
> issue, or it could be slow.
> 
> Maybe your ephemeral port range is too short.
> Can you provide some sysctl values such as net.inet.ip.portrange.* ?

I suspect that it's worse. I've read the whole thread, and to me it looks
like some firewall is blocking outgoing connections because a state
table is full. And if the problem only happens when haproxy is running,
the firewall is likely on the same machine. That's a common issue when
running firewalls on proxies because the connections are more than doubled
since the proxy is two-sided: each client connection gets a matching
connection to a server, and the health checks come on top. And generally
it happens exactly as described in the thread: CPU rises a bit, then
nothing passes (including checks) while the servers appear to be up when
manually tested.

Sebastian, do you have a way to check that on the haproxy machine? We'd
need stats on the number of sessions, and the kernel messages to see if
anything reports a full state table.
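
For the manual test from the lb, here is a minimal Python sketch (an
illustration only, not part of haproxy) that opens one brand new TCP
connection to a backend and times it, which is exactly what a ping does
not exercise. The helper name tcp_check, port 80 and the 3 second timeout
are assumptions; adserver2 is one of the names mentioned earlier in the
thread.

```python
import errno
import socket
import time


def tcp_check(host, port, timeout=3.0):
    """Open one new TCP connection, much like a health check would.

    Returns (ok, seconds, reason), where reason is "connected",
    "timeout", an errno name such as "ECONNREFUSED", or the error text.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, "connected"
    except socket.timeout:
        # no answer at all within the timeout
        return False, time.monotonic() - start, "timeout"
    except OSError as exc:
        # map the errno to its symbolic name when there is one
        reason = errno.errorcode.get(exc.errno, str(exc))
        return False, time.monotonic() - start, reason


# Hypothetical usage, run on the lb while haproxy reports the server down:
#   print(tcp_check("adserver2", 80))
```

A "timeout" while ping still works would be consistent with packets being
dropped somewhere on the way out, whereas ECONNREFUSED means the backend
itself rejected the connection.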

Regards,
Willy

