Hi all,

I am running HAProxy 2.3.14 in a Kubernetes cluster managed by the 
haproxy-ingress ingress controller: 
https://github.com/jcmoraisjr/haproxy-ingress

There are sporadic connection resets, and by capturing traffic I could identify 
the proxy as their origin. The author of haproxy-ingress thought the proxy 
might be crashing and suggested that I reach out here.

What I am seeing is regular traffic until the proxy suddenly sends a FIN/ACK to 
the server. The server is surprised by this and replies with a RST/ACK which 
the proxy forwards to the client. From the logs of the ingress controller, I 
could see that a reload was going on at that time and the connection was 20 
seconds old. In my setup, there are about 4800 backends and frequent changes 
to them that require reloads. At the same time, there are many long-lived 
incoming TCP connections, so old processes do not terminate quickly. Usually 
200-300 haproxy processes are running at any given time, and at times there 
are more than 400.

I am wondering whether there are known issues that could cause this, and what 
I could do to prevent such resets. Any help would be appreciated.

More context: https://github.com/jcmoraisjr/haproxy-ingress/issues/899

Reload happens via:
haproxy -f "$PARAM_CFG" -p "$HAPROXY_PID" -D -sf $OLD_PID -x "$HAPROXY_SOCKET"

Global and default settings:
global
    daemon
    unix-bind mode 0600
    nbthread 63
    cpu-map auto:1/1-63 0-62
    stats socket /var/run/haproxy/admin.sock level admin expose-fd listeners mode 600
    maxconn 30100
    tune.ssl.default-dh-param 2048
    ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
    ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
    ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets
    ssl-default-server-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
    ssl-default-server-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
    pp2-never-send-local

defaults
    log global
    maxconn 30100
    option redispatch
    option http-server-close
    option http-keep-alive
    timeout client          50s
    timeout client-fin      50s
    timeout connect         5s
    timeout http-keep-alive 1m
    timeout http-request    5s
    timeout queue           5s
    timeout server          50s
    timeout server-fin      50s
    timeout tunnel          24h

Thanks and best regards
Joerg
