Thank you for the support; we have fixed our issues.

Ha.

----- Original Message -----
From: "Jeff Mitchell" <[email protected]>
To: [email protected]
Sent: Friday, March 20, 2015 6:11:14 PM
Subject: 503 on alive backends, hanging processes on reload

I'm running haproxy 1.5.11-1ppa1~trusty from https://launchpad.net/~vbernat/+archive/ubuntu/haproxy-1.5 on Trusty (Ubuntu 14.04). It is a fairly basic configuration that mostly comes straight from the defaults:

global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL).
    ssl-default-bind-ciphers kEECDH+aRSA+AES:kRSA+AES:+AES256:RC4-SHA:!kEDH:!LOW:!EXP:!MD5:!aNULL:!eNULL
    ssl-default-bind-options no-sslv3

defaults
    log global
    mode http
    option httplog
    option dontlognull
    maxconn 1024
    timeout queue 5000
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

frontend ft_poml_vip
    bind :80
    acl host_apibrowse hdr_beg(host) -i apibrowse
    use_backend be_apibrowse if host_apibrowse

backend be_apibrowse
    server registry 10.88.24.3:49163

I also have several more ACLs and backends that are not shown, but follow the exact same pattern as above (with different host header matching). The main differences from the default are maxconn/timeout queue, both of which I set to try to solve this problem, and my simple frontend/backend.

After a time, calls from a web browser to haproxy are sometimes, but not always, being given 503 errors.
When I see this happening, if I sit on a very simple page and refresh rapidly, I will sometimes get 503s and sometimes not. I turned off health checks to ensure that failing health checks were not the source of the 503s.

What I have noticed is some oddness with the haproxy processes. Here is "date" and "ps -ef" output when I am seeing this behavior:

Fri Mar 20 21:55:38 GMT 2015
haproxy 19621 1 0 17:35 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 19599
haproxy 20075 1 0 20:50 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 20063
haproxy 20121 1 0 20:50 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 20112

"service haproxy reload" has been called at various times when the backends have come and gone and the config file has been rewritten, including at 17:35 and 20:50.

When haproxy is in this state, "service haproxy stop" does not stop all processes:

# service haproxy stop
 * Stopping haproxy haproxy [ OK ]
# ps -ef | grep haproxy
haproxy 19621 1 0 17:35 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 19599
haproxy 20075 1 0 20:50 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 20063

If I then start the service again, those same processes run, but with a new one:

# ps -ef | grep haproxy
haproxy 19621 1 0 17:35 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 19599
haproxy 20075 1 0 20:50 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf 20063
haproxy 20395 1 0 22:04 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid

When I run "service haproxy stop" and then manually kill any remaining processes, and then run "service haproxy start", I get just the one process:

# ps -ef | grep haproxy
haproxy 20443 1 0 22:05 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid

At this point I do *not* get the 503 errors. Everything runs great until the cycle repeats itself.

It feels like this is some issue with haproxy reloading. It is possible that reload was called multiple times rapidly when being performed by the automated system, but in my testing if I call it very rapidly from the command line I haven't been able to replicate the issue.

Any help would be much appreciated.

Thanks!
--Jeff
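
For anyone trying to spot the state described above, here is a minimal sketch of a helper that lists haproxy processes whose PIDs are not recorded in the pidfile. It is only an illustration, not a fix: the function name stale_haproxy_pids is hypothetical, and it assumes the "ps -ef" column layout shown in the message (PID in column 2, command in column 8) and a pidfile holding one PID per line.

```shell
#!/bin/sh
# Hypothetical helper (not part of haproxy or its init script).
# Reads "ps -ef" style lines on stdin; $1 is the whitespace- or
# newline-separated list of PIDs taken from the pidfile.
# Prints, one per line, the PID of every haproxy process that is
# NOT in that list.
stale_haproxy_pids() {
    # Pass the PID list through the environment so awk does no
    # escape processing on it.
    KEEP="$1" awk '
        BEGIN {
            n = split(ENVIRON["KEEP"], k, /[ \t\n]+/)
            for (i = 1; i <= n; i++)
                if (k[i] != "")
                    live[k[i]] = 1
        }
        # Assumption: column 2 is the PID and column 8 is the command,
        # as in the ps -ef output quoted in the message.
        $8 ~ /(^|\/)haproxy$/ && !($2 in live) { print $2 }
    '
}
```

Assuming the pidfile path from the command lines above, it could be called as: ps -ef | stale_haproxy_pids "$(cat /var/run/haproxy.pid)". Matching on the command column rather than on the whole line keeps the "grep haproxy" process itself out of the result.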

