I have no problem increasing the RAM if needed, but how do I know whether
it's needed? Where can I see the number of connections per second, to check
whether I somehow reached 20k? I don't think I reached 20k, because the
global maxconn is 20K...
This is my TCP tuning config for the LB:
# TCP stack tuning
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_max_syn_backlog = 10000
net.ipv4.tcp_max_tw_buckets = 400000
net.ipv4.tcp_max_orphans = 60000
net.ipv4.tcp_synack_retries = 3
net.core.somaxconn = 20000
# netfilter/iptables tuning
net.netfilter.nf_conntrack_max = 524288
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 30
# Allow binding to non-local IP addresses
net.ipv4.ip_nonlocal_bind = 0
This is my haproxy.cfg:
global
daemon
user haproxy
group proxy
log 127.0.0.1 local0
log-send-hostname
maxconn 20000
defaults
mode http
log global
retries 2
timeout client 90s # Client and server timeout must match the longest.
timeout server 90s # Time we may wait for a response from the server.
timeout queue 90s # Don't queue requests too long if saturated.
timeout connect 4s # There's no reason to change this one.
option abortonclose # Close aborted connections if they still didn't reach a backend (e.g. still in a queue).
option http-server-close # Enable HTTP connection closing on the server (backend) side.
option log-health-checks
option tcp-smart-accept
option tcp-smart-connect
frontend public
bind :80
maxconn 19500
option httplog
# Add the backend server ID as a response header
rspadd X-Backend:\ 0 if { srv_id 1 }
rspadd X-Backend:\ 1 if { srv_id 2 }
# Use dynamic backend if the request path ends with .php, fall back to the default static otherwise
acl url_dynamic path_end .php
use_backend dynamic if url_dynamic
default_backend static
backend dynamic
balance roundrobin
option forwardfor except 127.0.0.1 # Set the client IP in X-Forwarded-For except for when the client IP is loopback (nginx SSL termination).
option httpchk GET /dynamic_health_check
default-server inter 4000
server web-01 web-01:80 maxconn 80 check
server web-02 web-02:80 maxconn 80 check
backend static
balance roundrobin
option httpchk GET /static_health_check
server web-01 web-01:80 check
server web-02 web-02:80 check
# Enable the stats page on a dedicated port (8888)
listen stats
# Uncomment 'disabled' below to disable the stats page
# disabled
bind :8888
stats uri /
stats realm HAProxy\ Statistics
stats auth admin:my-secret-password
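
On the connections-per-second question: maxconn limits concurrent
connections, not connections per second, so the two are worth checking
separately. The stats page configured above shows both for each frontend
(current/max sessions and session rate), and the same table can be exported
as CSV by appending ;csv to the stats URI. Below is a minimal Python sketch
of reading such an export. The sample data is made up and trimmed to a few
columns, and the function names are only illustrative.

```python
import csv
import io

# Hypothetical, trimmed sample of a stats CSV export.
# A real export has many more columns; the numbers here are invented.
SAMPLE_CSV = """# pxname,svname,scur,smax,slim,rate,rate_max
public,FRONTEND,812,19500,19500,950,4200
dynamic,web-01,80,80,80,40,120
dynamic,BACKEND,160,160,2000,85,240
"""


def parse_stats(text):
    """Parse the CSV export into a list of dicts keyed by column name."""
    # The header line starts with "# "; drop it so column names are clean.
    if text.startswith("# "):
        text = text[2:]
    return list(csv.DictReader(io.StringIO(text)))


def frontend_summary(rows, name):
    """Return session counters for one frontend, or None if not found."""
    for row in rows:
        if row["pxname"] == name and row["svname"] == "FRONTEND":
            keys = ("scur", "smax", "slim", "rate", "rate_max")
            return {k: int(row[k]) for k in keys}
    return None


def hit_session_limit(summary):
    """True if the peak concurrent sessions reached the configured limit."""
    return summary["smax"] >= summary["slim"]


if __name__ == "__main__":
    summary = frontend_summary(parse_stats(SAMPLE_CSV), "public")
    print("peak concurrent sessions:", summary["smax"], "of", summary["slim"])
    print("peak sessions per second:", summary["rate_max"])
    print("limit reached:", hit_session_limit(summary))
```

Here smax against slim answers "did I hit maxconn?", while rate_max is the
peak number of new sessions per second since the counters were last reset.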
Any help would be much appreciated, we're experiencing issues with less
traffic than before haproxy...
Thanks,
Bar.
On Sat, May 12, 2012 at 2:31 PM, Willy Tarreau wrote:
> On Sat, May 12, 2012 at 01:23:17PM +0200, Baptiste wrote:
> > On Sat, May 12, 2012 at 1:01 PM, Bar Ziony wrote:
> > > Willy,
> > >
> > > Thank you, I will follow up with your suggestions soon.
> > >
> > > But I just had a production down-time with the haproxy machine:
> > > After posting something to our Facebook wall (it happened twice,
> > > yesterday and 3 days ago), which usually brings more traffic (but not
> > > more than we can usually handle, for example before haproxy was
> > > deployed), the haproxy machine got into swap, all the memory was taken
> > > (1GB) and keepalived kept bouncing to the backup machine (I believe
> > > because it was so unresponsive).
> > >
> > > How can I check that further? Should I just increase the machine's RAM?
> > >
> > > Thanks,
> > > Bar.
> > >
> >
> > Maybe you can share with us some sysctl (mainly the ones related to
> > TCP buffers), as well as your HAProxy configuration (hiding private
> > information)
> > Are there any other processes which may eat memory on the machine?
>
> tcp_mem is often quite sensitive, you need to limit it if you don't have
> enough RAM. You can also reduce haproxy's buffer size. I run all machines
> at slightly less than 8kB which is more than enough and holds 5.5 TCP
> segments, limiting copies in the kernel:
>
> global
> tune.bufsize 8030
> tune.maxrewrite 1030
>
> The numbers I'm used to seeing with these settings are 1/3 of the RAM used
> by haproxy, 1/3 used by socket buffers in the kernel and the last 1/3
> for the rest of the system.
>
> With such numbers, you have each connection take around 17kB on haproxy,
> which theoretically allows up to around 40k concurrent conns on a 1GB
> machine.
> Warning, it's very tricky to reach 40k conns per GB. Better stay safe and
> aim at 20k per GB.
>
> Regards,
> Willy
>
>
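
A back-of-the-envelope check of the figures quoted above, as a Python
sketch. It is my own arithmetic, not necessarily how those numbers were
derived. It assumes two buffers of tune.bufsize bytes per connection plus
about 1 kB of per-session overhead (a guess, chosen to land near the quoted
17kB), and that haproxy gets one third of the RAM. Kernel socket buffers are
not modelled.

```python
GIB = 1024 ** 3


def per_connection_bytes(bufsize, overhead=1000):
    """Estimated haproxy memory per connection, in bytes.

    Assumes one request buffer and one response buffer of `bufsize` bytes,
    plus `overhead` bytes of session bookkeeping (an assumed figure).
    """
    return 2 * bufsize + overhead


def max_connections(ram_bytes, bufsize, haproxy_share_divisor=3):
    """Connections that fit if haproxy may use 1/divisor of the RAM."""
    haproxy_bytes = ram_bytes // haproxy_share_divisor
    return haproxy_bytes // per_connection_bytes(bufsize)


if __name__ == "__main__":
    print("bytes per connection:", per_connection_bytes(8030))
    print("connections on 1 GiB:", max_connections(GIB, 8030))
```

With tune.bufsize 8030 this comes out at roughly 21k connections for a 1 GB
machine, in line with the "aim at 20k per GB" advice.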