I'm running haproxy 1.5.11-1ppa1~trusty from
https://launchpad.net/~vbernat/+archive/ubuntu/haproxy-1.5 on Trusty
(Ubuntu 14.04).
It is a fairly basic configuration that mostly comes straight from the defaults:

global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL).
    ssl-default-bind-ciphers kEECDH+aRSA+AES:kRSA+AES:+AES256:RC4-SHA:!kEDH:!LOW:!EXP:!MD5:!aNULL:!eNULL
    ssl-default-bind-options no-sslv3

defaults
    log global
    mode http
    option httplog
    option dontlognull
    maxconn 1024
    timeout queue 5000
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

frontend ft_poml_vip
    bind :80
    acl host_apibrowse hdr_beg(host) -i apibrowse
    use_backend be_apibrowse if host_apibrowse

backend be_apibrowse
    server registry 10.88.24.3:49163

I also have several more ACLs and backends that are not shown, but
follow the exact same pattern as above (with different host header
matching).
The main differences from the default are maxconn/timeout queue, both
of which I set to try to solve this problem, and my simple
frontend/backend.
After running for a while, haproxy starts returning 503 errors for
some, but not all, requests from a web browser. While this is
happening, if I sit on a very simple page and refresh rapidly, some
refreshes get a 503 and others succeed. I turned off health checks to
rule out failing health checks as the source of the 503s.
What I have noticed is some oddness with the haproxy processes. Here
is the output of "date" and "ps -ef" while I am seeing this behavior:

    Fri Mar 20 21:55:38 GMT 2015
    haproxy 19621 1 0 17:35 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 19599
    haproxy 20075 1 0 20:50 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 20063
    haproxy 20121 1 0 20:50 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 20112

"service haproxy reload" has been called at various times when the
backends have come and gone and the config file has been rewritten,
including at 17:35 and 20:50.
When haproxy is in this state, "service haproxy stop" does not stop
all processes:

    # service haproxy stop
     * Stopping haproxy haproxy                                  [ OK ]
    # ps -ef | grep haproxy
    haproxy 19621 1 0 17:35 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 19599
    haproxy 20075 1 0 20:50 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 20063

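
One way to spot the leftover processes is to compare the running
haproxy PIDs against the pidfile. The following is only a rough
sketch: it assumes the init script's stop action signals just the PIDs
recorded in /var/run/haproxy.pid (an assumption, not something
confirmed here), that the pidfile holds one PID per line, and
"stale_pids" is a made-up helper name.

```shell
#!/bin/sh
# Pidfile path as it appears in the ps output; override via environment.
PIDFILE="${PIDFILE:-/var/run/haproxy.pid}"

# stale_pids PIDFILE
# Reads candidate PIDs on stdin (one per line) and prints those that are
# NOT listed in PIDFILE. If PIDFILE is missing, every PID is printed.
stale_pids() {
    pidfile="$1"
    while read -r pid; do
        # Skip blank lines in the input.
        [ -n "$pid" ] || continue
        # -x matches the whole line, so 2007 does not match 20075.
        if [ ! -f "$pidfile" ] || ! grep -qx "$pid" "$pidfile"; then
            echo "$pid"
        fi
    done
}

# Example invocation (not executed here):
#   pgrep -x haproxy | stale_pids "$PIDFILE"
```

Any PID this prints is a haproxy process that a pidfile-based stop or
reload would not know about.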
If I then start the service again, those same processes keep running alongside a new one:

    # ps -ef | grep haproxy
    haproxy 19621 1 0 17:35 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 19599
    haproxy 20075 1 0 20:50 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 20063
    haproxy 20395 1 0 22:04 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid

When I run "service haproxy stop" and then manually kill any remaining
processes, and then run "service haproxy start", I get just the one
process:

    # ps -ef | grep haproxy
    haproxy 20443 1 0 22:05 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid

At this point I do *not* get the 503 errors. Everything runs great
until the cycle repeats itself.
This looks like a problem with how haproxy is reloaded. The automated
system may have called reload several times in quick succession, but
when I call reload very rapidly from the command line I have not been
able to reproduce the issue.
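
For anyone who wants to try reproducing it, a rapid-reload test can be
sketched roughly like this. It is an illustration, not the exact
commands used: "rapid_reload" is a made-up helper name and the count
is arbitrary.

```shell
#!/bin/sh
# rapid_reload COUNT [CMD...]
# Runs CMD back to back COUNT times with no delay in between.
# CMD defaults to "service haproxy reload".
rapid_reload() {
    count="$1"
    shift
    # Fall back to the reload command from this post if none is given.
    [ "$#" -gt 0 ] || set -- service haproxy reload
    i=0
    while [ "$i" -lt "$count" ]; do
        # Keep going on failure, but report which iteration failed.
        "$@" || echo "iteration $i failed" >&2
        i=$((i + 1))
    done
}

# Example invocation (not executed here):
#   rapid_reload 20
#   ps -ef | grep '[h]aproxy'
```

Checking the process list afterwards shows whether more than one
haproxy process was left behind.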
Any help would be much appreciated.
Thanks!
--Jeff