503 on alive backends, hanging processes on reload

Jeff Mitchell Fri, 20 Mar 2015 15:12:39 -0700

I'm running haproxy 1.5.11-1ppa1~trusty from
https://launchpad.net/~vbernat/+archive/ubuntu/haproxy-1.5 on Trusty
(Ubuntu 14.04).


It is a fairly basic configuration that mostly comes straight from the defaults:

global
    log /dev/log    local0
    log /dev/log    local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL).
    ssl-default-bind-ciphers kEECDH+aRSA+AES:kRSA+AES:+AES256:RC4-SHA:!kEDH:!LOW:!EXP:!MD5:!aNULL:!eNULL
    ssl-default-bind-options no-sslv3

defaults
    log global
    mode    http
    option  httplog
    option  dontlognull
    maxconn 1024
    timeout queue 5000
    timeout connect 5000
    timeout client  50000
    timeout server  50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

frontend ft_poml_vip
    bind :80

    acl host_apibrowse hdr_beg(host) -i apibrowse
    use_backend be_apibrowse if host_apibrowse

backend be_apibrowse
    server registry 10.88.24.3:49163

I also have several more ACLs and backends that are not shown, but
follow the exact same pattern as above (with different host header
matching).

The main differences from the default are maxconn/timeout queue, both
of which I set to try to solve this problem, and my simple
frontend/backend.

After a while, requests from a web browser to haproxy intermittently
get 503 errors. When I see this happening, if I
sit on a very simple page and refresh rapidly, I will sometimes get
503s and sometimes not. I turned off health checks to ensure that
failing health checks were not the source of the 503s.

What I have noticed is some oddness with the haproxy processes. Here
is "date" and "ps -ef" output when I am seeing this behavior:

Fri Mar 20 21:55:38 GMT 2015

haproxy  19621     1  0 17:35 ?        00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 19599
haproxy  20075     1  0 20:50 ?        00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 20063
haproxy  20121     1  0 20:50 ?        00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 20112

"service haproxy reload" has been called at various times when the
backends have come and gone and the config file has been rewritten,
including at 17:35 and 20:50.

When haproxy is in this state, "service haproxy stop" does not stop
all processes:

# service haproxy stop
 * Stopping haproxy haproxy                    [ OK ]
# ps -ef | grep haproxy
haproxy  19621     1  0 17:35 ?        00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 19599
haproxy  20075     1  0 20:50 ?        00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 20063

If I then start the service again, those same processes keep running
alongside a new one:

# ps -ef | grep haproxy
haproxy  19621     1  0 17:35 ?        00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 19599
haproxy  20075     1  0 20:50 ?        00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 20063
haproxy  20395     1  0 22:04 ?        00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid

When I run "service haproxy stop" and then manually kill any remaining
processes, and then run "service haproxy start", I get just the one
process:
# ps -ef | grep haproxy
haproxy  20443     1  0 22:05 ?        00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid

At this point I do *not* get the 503 errors. Everything runs great
until the cycle repeats itself.
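
For reference, the "find whatever the init script left behind" part of
the manual cleanup could be scripted roughly like this. It is only a
sketch: the function name stale_haproxy_pids is made up for
illustration, and it assumes the pid file (/var/run/haproxy.pid in the
ps output above) lists exactly the processes the init script still
tracks, which may not hold on every setup.

```shell
#!/bin/sh
# Sketch only: print PIDs of haproxy processes that are NOT listed in
# the pid file. Expects "<pid> <command>" lines on stdin, e.g. from
# "ps -eo pid=,comm=". The function name is hypothetical.
stale_haproxy_pids() {
    pidfile=$1
    # PIDs the init script knows about (empty if the file is missing).
    known=$(cat "$pidfile" 2>/dev/null)
    while read -r pid comm; do
        # Skip anything that is not an haproxy process.
        [ "$comm" = "haproxy" ] || continue
        case " $(echo $known) " in
            *" $pid "*) ;;       # tracked by the pid file, leave it alone
            *) echo "$pid" ;;    # not tracked: likely left from a reload
        esac
    done
}
```

It would be called as
"ps -eo pid=,comm= | stale_haproxy_pids /var/run/haproxy.pid". It only
prints candidate PIDs and kills nothing, so the list can be checked by
hand first.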

It feels like this is some issue with haproxy reloading. It is
possible that reload was called multiple times rapidly when being
performed by the automated system, but in my testing if I call it very
rapidly from the command line I haven't been able to replicate the
issue.

Any help would be much appreciated.

Thanks!
--Jeff
