I've been using HAProxy to load-balance my app servers for many months
without problems. Recent traffic spikes led me to set the maxconn
parameter to limit concurrent connections to my backend servers.
It works well for several hours, and then HAProxy seems to stop accepting
connections altogether. It doesn't just reach the maxconn limit and keep
serving existing requests; it stops working completely until I restart
HAProxy. (I can leave it for an hour and not a single connection will
be active to my backend app servers, and I can't even access the HAProxy
admin page.)
I've looked at my system resource graphs, and both loadavg and RAM
usage seem to be under control with no big spikes before the freeze.
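To make "stops accepting connections" more precise, it may help to check from outside whether a connection attempt is refused, times out, or completes. The following is only a minimal Python sketch of such a check; the function name probe_listener and the 3-second timeout are illustrative assumptions, not part of HAProxy.

```python
import socket


def probe_listener(host, port, timeout=3.0):
    """Classify one TCP connection attempt to host:port.

    Returns "accepted" if the TCP handshake completed, "refused" if the
    port actively rejected the connection, "timeout" if nothing answered
    within `timeout` seconds, or "error: ..." for any other OS error.

    Note: "accepted" only means the kernel completed the handshake. A
    process that has stopped calling accept() can still show up as
    "accepted" while its listen backlog has room.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "accepted"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
```

Run on the HAProxy host during a freeze, probe_listener("127.0.0.1", 8081) and probe_listener("127.0.0.1", 80) would show whether the stats and traffic listeners behave the same way.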
Here are my settings, followed by the logs covering one restart, the freeze, and the next restart.
# HA-Proxy version 1.4.16 2011/08/04
global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    #log 127.0.0.1 local1 debug
    #log loghost local0 info
    maxconn 4096
    #chroot /usr/share/haproxy
    user haproxy
    group haproxy
    daemon
    #debug
    #quiet

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option forwardfor
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

listen platform-cache 0.0.0.0:80
    mode http
    maxconn 10000
    option abortonclose
    #balance uri
    hash-type consistent
    balance hdr(Host)
    server varnish1 10.176.129.245 weight 20 maxconn 64 check
    server varnish2 10.176.129.29 weight 40 maxconn 64 check
    contimeout 60000

# HTTP response : 'HTTP/1.0 200 OK'
listen http_health_check 0.0.0.0:60001
    mode health
    option httpchk

listen stats_for_scout 127.0.0.1:8081
    mode http
    stats uri /stats

listen public_stats :8080
    mode http
    stats uri /stats
    stats realm Haproxy\ Statistics
    stats auth **************
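The config above sets maxconn at several levels (global, defaults, the platform-cache listen section, and each server line). As a rough aid for comparing them side by side, here is a minimal Python sketch that lists every active maxconn value by scope. The function name collect_maxconn and the trimmed SAMPLE_CONFIG are illustrative only, and the parser assumes the simple one-directive-per-line layout used here; it is not a general HAProxy config parser.

```python
SECTION_KEYWORDS = {"global", "defaults", "listen", "frontend", "backend"}

# Trimmed copy of the relevant lines from the config in this post.
SAMPLE_CONFIG = """\
global
    maxconn 4096
defaults
    maxconn 2000
listen platform-cache 0.0.0.0:80
    maxconn 10000
    #balance uri
    server varnish1 10.176.129.245 weight 20 maxconn 64 check
    server varnish2 10.176.129.29 weight 40 maxconn 64 check
"""


def collect_maxconn(config_text):
    """Return (scope, value) pairs for every uncommented maxconn setting."""
    scope = None
    found = []
    for raw in config_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        words = line.split()
        if words[0] in SECTION_KEYWORDS:
            # A new section starts; remember its keyword and name.
            scope = " ".join(words[:2])
        elif words[0] == "server" and "maxconn" in words:
            value = int(words[words.index("maxconn") + 1])
            found.append((f"{scope} / server {words[1]}", value))
        elif words[0] == "maxconn":
            found.append((scope, int(words[1])))
    return found


for scope, value in collect_maxconn(SAMPLE_CONFIG):
    print(f"{scope}: maxconn {value}")
```

For this config the listing shows a global limit of 4096 that is lower than the 10000 set on the platform-cache section, and 64 per server, which suggests the process-wide limit may be the one reached first.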
Here are the logs from one restart to the next freeze and the restart a
few hours later:
Feb 5 08:57:15 localhost haproxy[31950]: Proxy platform-cache started.
Feb 5 08:57:15 localhost haproxy[31950]: Proxy http_health_check started.
Feb 5 08:57:15 localhost haproxy[31950]: Proxy stats_for_scout started.
Feb 5 08:57:15 localhost haproxy[31950]: Proxy public_stats started.
Feb 5 09:43:47 localhost haproxy[31951]: Pausing proxy platform-cache.
Feb 5 09:43:47 localhost haproxy[31951]: Pausing proxy http_health_check.
Feb 5 09:43:47 localhost haproxy[31951]: Pausing proxy stats_for_scout.
Feb 5 09:43:47 localhost haproxy[31951]: Pausing proxy public_stats.
Feb 5 09:43:47 localhost haproxy[32746]: Proxy platform-cache started.
Feb 5 09:43:47 localhost haproxy[32746]: Proxy http_health_check started.
Feb 5 09:43:47 localhost haproxy[32746]: Proxy stats_for_scout started.
Feb 5 09:43:47 localhost haproxy[32746]: Proxy public_stats started.
Feb 5 09:43:47 localhost haproxy[31951]: Stopping proxy platform-cache in 0 ms.
Feb 5 09:43:47 localhost haproxy[31951]: Stopping proxy http_health_check in 0 ms.
Feb 5 09:43:47 localhost haproxy[31951]: Stopping proxy stats_for_scout in 0 ms.
Feb 5 09:43:47 localhost haproxy[31951]: Stopping proxy public_stats in 0 ms.
Feb 5 09:43:47 localhost haproxy[31951]: Proxy platform-cache stopped (FE: 32540 conns, BE: 30334 conns).
Feb 5 09:43:47 localhost haproxy[31951]: Proxy http_health_check stopped (FE: 0 conns, BE: 0 conns).
Feb 5 09:43:47 localhost haproxy[31951]: Proxy stats_for_scout stopped (FE: 16 conns, BE: 0 conns).
Feb 5 09:43:47 localhost haproxy[31951]: Proxy public_stats stopped (FE: 4 conns, BE: 2 conns).
Feb 5 17:52:27 localhost haproxy[26610]: Proxy platform-cache started.
Feb 5 17:52:27 localhost haproxy[26610]: Proxy http_health_check started.
Feb 5 17:52:27 localhost haproxy[26610]: Proxy stats_for_scout started.
Feb 5 17:52:27 localhost haproxy[26610]: Proxy public_stats started.
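The "stopped (FE: ... conns, BE: ... conns)" lines in the logs above carry per-proxy connection counts, which I understand to be totals for the lifetime of the old process. A minimal Python sketch for pulling them out of a syslog excerpt follows; it assumes each log entry sits on a single line, and the names STOPPED_RE and connection_totals are illustrative.

```python
import re

# Matches e.g. "haproxy[31951]: Proxy platform-cache stopped (FE: 32540 conns, BE: 30334 conns)."
STOPPED_RE = re.compile(
    r"haproxy\[(?P<pid>\d+)\]: Proxy (?P<proxy>\S+) stopped "
    r"\(FE: (?P<fe>\d+) conns, BE: (?P<be>\d+) conns\)"
)

# The four "stopped" entries from the logs in this post, one per line.
SAMPLE_LOG = """\
Feb 5 09:43:47 localhost haproxy[31951]: Proxy platform-cache stopped (FE: 32540 conns, BE: 30334 conns).
Feb 5 09:43:47 localhost haproxy[31951]: Proxy http_health_check stopped (FE: 0 conns, BE: 0 conns).
Feb 5 09:43:47 localhost haproxy[31951]: Proxy stats_for_scout stopped (FE: 16 conns, BE: 0 conns).
Feb 5 09:43:47 localhost haproxy[31951]: Proxy public_stats stopped (FE: 4 conns, BE: 2 conns).
"""


def connection_totals(log_text):
    """Return (proxy, frontend_conns, backend_conns) for each 'stopped' entry."""
    totals = []
    for line in log_text.splitlines():
        match = STOPPED_RE.search(line)
        if match:
            totals.append(
                (match["proxy"], int(match["fe"]), int(match["be"]))
            )
    return totals


for proxy, fe, be in connection_totals(SAMPLE_LOG):
    print(f"{proxy}: FE {fe}, BE {be}, difference {fe - be}")
```

The difference column is only a quick way to see how many frontend connections were never counted on the backend side during that run.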
Thanks for your help!
Erik
PS: I cross-posted this on ServerFault, but haven't received any helpful
responses yet:
http://serverfault.com/questions/356979/haproxy-stops-accepting-connections