Re: Help determining where the bottleneck is

Steve V Thu, 02 Feb 2012 08:25:36 -0800

Thanks for the response.

The stats were actually lagging. We determined that the bottleneck was
before HAProxy (it ended up being the IPS in front of the network).


However, our Linux guy suggested the following sysctl changes to improve
throughput, which I will share here:

net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65023
net.ipv4.tcp_max_syn_backlog = 100000
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_synack_retries = 2
net.core.somaxconn = 60000
net.core.netdev_max_backlog = 10000
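
A rough sketch, not part of the original advice, for checking which of
these values a running box already has. It assumes a Linux host where
sysctl keys map to files under /proc/sys (dots become slashes); the helper
names parse_sysctl, proc_path and read_current are made up for this
example.

```python
# Sketch: compare the running kernel settings with the values listed above.
# Assumes Linux, where sysctl key "a.b.c" is readable at /proc/sys/a/b/c.
# Helper names are illustrative only.
from pathlib import Path

SUGGESTED = """\
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65023
net.ipv4.tcp_max_syn_backlog = 100000
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_synack_retries = 2
net.core.somaxconn = 60000
net.core.netdev_max_backlog = 10000
"""


def parse_sysctl(text):
    """Parse 'key = value' lines into a dict, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        # Normalise whitespace so "1024   65023" compares equal to "1024 65023".
        settings[key.strip()] = " ".join(value.split())
    return settings


def proc_path(key, root="/proc/sys"):
    """Map a sysctl key to its file under /proc/sys."""
    return Path(root) / key.replace(".", "/")


def read_current(key, root="/proc/sys"):
    """Return the current value as a string, or None if it cannot be read."""
    try:
        return " ".join(proc_path(key, root).read_text().split())
    except OSError:
        return None


if __name__ == "__main__":
    for key, wanted in parse_sysctl(SUGGESTED).items():
        current = read_current(key)
        status = "ok" if current == wanted else "differs"
        print(f"{key}: current={current} suggested={wanted} ({status})")
```

The script only reads values. To make the settings persistent, the usual
route is to put the lines in /etc/sysctl.conf and load them with
"sysctl -p".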

On Sun, Jan 29, 2012 at 5:26 AM, Willy Tarreau <[email protected]> wrote:

> Hi Steve,
>
> On Tue, Jan 24, 2012 at 08:55:15AM -0800, Steve V wrote:
> > Good morning,
> >
> > Much love for haproxy and many thanks to all who have worked on and
> > contributed to it.  We have been using it for several years without
> > issue.  However, we have been doing load testing lately and there
> > appears to be a bottleneck.  It may not even have to do with haproxy
> > (I don't think it does), but I need to double-check anyway just to be
> > thorough and cover all our bases.
> >
> > Hardware: VM running on ESXi, with 2 GB of RAM and 2 CPUs allocated
> > to it
> > GuestOS: CentOS 5
> > Haproxy version: 1.4.8 (however, we just upgraded to 1.4.19 last night)
> >
> > Problem: "second_proxy" is getting hammered by a load test, and site
> > performance decreases to the point where the site is barely usable
> > and the majority of pages time out.  However, go to a different site
> > that is in the same haproxy config, listening on "http_proxy" and
> > going to the same backend server, and the site comes up fine and
> > fast.  It seems like something is being throttled or queued
> > somewhere.  It's possible that it could be an issue behind haproxy on
> > the app servers, but I just want to make sure there is nothing I need
> > to tweak in my config.
> >
> > Here is a snapshot of the haproxy stats page for the slow pool
> > "second_proxy" http://tinypic.com/r/15887qf/5
>
> Did you tune any sysctl on your system ?
> Your snapshot reports a peak of 1600 conns/second, but the default kernel
> settings (somaxconn 128 and tcp_max_syn_backlog 1024) make this hard to
> reach, so it's very possible that the socket queue is simply full. I
> usually set both between 10000 and 20000 with good success.
>
> There is something you can try to detect whether haproxy still accepts
> connections fine: simply try to connect to the stats URL on the
> unresponsive port. If the stats display properly, then you're stuck on
> the servers. If the stats do not respond either, then the connection is
> not being accepted.
>
> Be careful, you have no "maxconn" setting in the "defaults" section, and
> by default a listen uses 2000. I'm seeing that your snapshot indicates
> that this limit was not reached, still I wanted to let you know it's
> going to be the next issue once this one is resolved.
>
> > here is my haproxy.cfg
> >
> > global
> >         maxconn     8096
> >         daemon
> >         nbproc      1
> >         stats socket /var/run/haproxy.stat
> >
> > defaults
> >         clitimeout  600000
> >         srvtimeout  600000
>
> Do you realize that this is 10 minutes (we're speaking HTTP here) ?
>
> Regards,
> Willy
>
>
