Re: excessive socket failures

Willy Tarreau Thu, 27 Sep 2012 23:46:14 -0700

On Fri, Sep 28, 2012 at 08:30:11AM +0200, Fred Leeflang wrote:
> On 09/28/2012 08:12 AM, Willy Tarreau wrote:
> >What happens sometimes is conntrack is loaded with default settings in 
> >the hypervisor, limiting the connection rate to a very low throughput 
> >once all ports have been used. However bitrate is not affected of course. 
> 
> I've asked our sysadmin to remove the conntrack module altogether, from 
> googling this, this seems to be the most adequate solution.


OK.

> >>I've also done an iperf test from the second lb's interface to the first
> >>lb's interface (both are on separate physical machines) and this results
> >>in a throughput of 941Mbits/s.
> >OK so at least we can say that the physical network works well.
> >
> >In your logs I'm seeing that your nginx server responds in roughly
> >50-100ms, and that you have around 10 concurrent connections on the
> >frontend max. This means around 100-200 connections per second max. It
> >would thus be possible that you're limited there (or by the number of
> >concurrent conns sent by siege).
> I just ran siege from the internal network to haproxy first; It would 
> seem that the issue doesn't happen here (earlier tests were to the 
> external IP on the BSD firewall, this one to the 10.x.x.x interface), so 
> it might be that the BSD firewall is causing issues here?

Yes, that's very possible indeed. This brings back old memories of when I
was running OpenBSD on my VAX: pf blocked some connections after the ports
rolled over. I *believe* it was because it was adding too much randomness
to sequence numbers without considering the source port, causing the
randomized SYNs to be dropped by the server during the TIME_WAIT state. But
these are old memories, I may be wrong.
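
To illustrate why a badly chosen sequence number can get a SYN dropped in
TIME_WAIT, here is a minimal sketch of the kind of check many TCP stacks
apply when a SYN arrives for a connection still in TIME_WAIT. It is a
simplified model, not pf's or any particular kernel's code: the function
names and the last_rcv_nxt parameter are made up for the example, and real
stacks may also consider TCP timestamps.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32 bits and wrap around


def seq_after(a, b):
    """Return True if sequence number a is after b, allowing for wraparound."""
    return a != b and ((a - b) % SEQ_MOD) < 2 ** 31


def accept_syn_in_time_wait(syn_seq, last_rcv_nxt):
    """Simplified model (an assumption, see above): reopen the connection
    only if the new SYN's sequence number is beyond what the previous
    incarnation of the connection already used; otherwise treat the SYN
    as an old duplicate and drop it."""
    return seq_after(syn_seq, last_rcv_nxt)
```

With a check like this, a fully random initial sequence number for a reused
source port has roughly a one in two chance of falling behind the old
connection's sequence space, in which case the SYN is dropped.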

If you're doing some masquerading at home, it is also possible that your
firewall is wrongly reusing the same source port with an inappropriate
sequence number, causing the other firewall to drop the packets because it
considers them old duplicates. So a tcpdump would definitely help there.
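
For reference, the 100-200 connections per second figure quoted earlier
follows from dividing the number of concurrent connections by the response
time. A small sketch of that arithmetic, where the function name is just
illustrative and the inputs are the approximate values read from the logs:

```python
def max_conn_rate(concurrent_conns, response_time_ms):
    """Upper bound on connections per second when each connection is held
    for about response_time_ms and at most concurrent_conns are open."""
    return concurrent_conns * 1000 / response_time_ms


# Roughly 10 concurrent connections, responses taking 50 to 100 ms
fast = max_conn_rate(10, 50)    # 200.0 connections per second
slow = max_conn_rate(10, 100)   # 100.0 connections per second
print(slow, fast)
```

So with only about 10 connections in flight, the test cannot go much beyond
a couple hundred connections per second regardless of the network.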

> >- update your haproxy to the latest stable version in your branch 
> >(1.4.22) to get all known fixes, and check again. If nothing here 
> >helps, then a tcpdump on the siege host would help. Regards, Willy 
> 
> Okay, I'll do this. I simply installed the Debian wheezy package. As 
> there are so many resolved bugs, perhaps it's a good plan to get a 
> package built for wheezy before release?

Maybe, I don't know. I never understood Debian's maintenance plans. I
know that maintainers have a hard time pushing fixes once the distro
is declared "stable", which in fact seems to mean "frozen with all known
bugs there forever" :-(

Now 1.4 is stable enough to be packaged with the risk that it will never
be upgraded, but clearly an early version should not go into a distro
which does not maintain fixes.

Regards,
Willy
