On 6 Feb 2018 4:38 am, "Kai Timmer" <k...@staffbase.com> wrote:
Hello,

I recently tried to update from v1.6.14 to v1.8.3 but ran into a lot of problems. I hope that I made a mistake in my configuration that works in 1.6 but blows up my system in 1.8, so I'm going to describe my setup and workload in the hope that someone here can help.

I run 2 haproxy servers in front of multiple Jetty-based backend systems. The main source of data for those backend servers is a MongoDB. Both haproxy servers receive traffic and use keepalived for failover.

Before I updated to v1.8, the average time to connect was about 0.5ms. After the update it stayed like that while there was almost no traffic (at night). But as soon as traffic picked up, the time to connect went up to 500-600ms. The problem occurred at around 2000 requests per minute. Restarting the haproxy process helped for a while, but after a few minutes the problem was back.

One other thing I'd like to share is that I see a peak in MongoDB write lock times after updating haproxy. It is probably a side effect of something else, but it still correlates with the new haproxy version. As soon as I downgraded back to v1.6.14, the problem went away.

This is the haproxy.cfg that I use (same config for 1.6.14 and 1.8.3):

##########################################################################
global
    log 127.0.0.1 local1 info
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    ulimit-n 65536
    user haproxy
    group haproxy
    daemon
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private
    tune.ssl.default-dh-param 2048
    ssl-default-bind-options no-sslv3 no-tls-tickets
    ssl-default-bind-ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:!DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA
    stats socket /var/run/haproxy.sock level admin
    stats timeout 30s
    maxconn 22000

defaults
    mode http
    log global
    option dontlognull
    option httplog
    option log-health-checks
    timeout connect 3600000ms
    timeout client 3600000ms
    timeout server 3600000ms
    timeout check 10s
    balance roundrobin
    maxconn 20000
    fullconn 2000

resolvers ourdns
    nameserver dns1 10.13.138.250:53
    nameserver dns2 10.13.138.251:53
    resolve_retries 3
    timeout retry 1s
    hold valid 30s

#---------------------------------------------------------------------
# frontend HTTP
#---------------------------------------------------------------------
frontend http-in
    option http-server-close
    bind *:80
    redirect scheme https code 301 if !{ ssl_fc }

#---------------------------------------------------------------------
# frontend HTTPS
#---------------------------------------------------------------------
frontend https-in
    option forwardfor
    http-request set-header X-Forwarded-Port %[dst_port]
    http-request set-header X-Forwarded-Proto https
    rspadd Strict-Transport-Security:\ max-age=31536000;\ includeSubdomains;\ preload
    bind *:443 ssl no-sslv3 crt /etc/ssl/ourssl/ourcert.com.pem crt /etc/ssl/oldcert/oldcert.net.pem

    acl host_backend hdr_dom(host) -i de backend
    use_backend our.backend if host_backend

    # deny all requests to /metrics
    acl restricted_metrics path_beg,url_dec -i /metrics
    http-request deny if restricted_metrics

    default_backend eyo.backend
    rspidel ^Server:.*$
    rspidel ^X-Powered-By:.*$

#---------------------------------------------------------------------
# backend
#---------------------------------------------------------------------
backend our.backend
    description Backend (Java, Port 8080)
    option http-server-close
    option http-pretend-keepalive
    timeout server 2700000ms
    option httpchk GET /health HTTP/1.0\r\nAuthorization:\ Basic\ foobar
    http-check expect status 200
    default-server inter 5s fall 2 rise 2
    server backend01 backend01.prod:8080 resolvers ourdns check slowstart 60s # regular server
    server backend02 backend02.prod:8080 resolvers ourdns check slowstart 60s # regular server
    server backend03 backend03.prod:8080 resolvers ourdns check slowstart 60s # regular server
    server backend04 backend04.prod:8080 resolvers ourdns check slowstart 60s # regular server
    server backend05 backend05.prod:8080 resolvers ourdns check slowstart 60s # regular server
    server backend06 backend06.prod:8080 resolvers ourdns check slowstart 60s # regular server
##########################################################################

Any help on what might be going on here is highly appreciated.

Regards,
Kai

Maybe the logs would be useful here from the time of the "incident".
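
Since the config enables "option httplog", each request line should carry five slash-separated timers, and the third one is Tc (time spent establishing the connection to the server). Below is a rough sketch, not a tested tool, of how those lines could be summarised per minute to see when Tc starts to climb. It assumes the reported "time to connect" corresponds to Tc, and that the log is the default httplog format; the function name and the way the file is passed in are my own choices for illustration.

```python
import re
import sys
from collections import defaultdict

# Picks out the accept date "[dd/Mon/yyyy:HH:MM:SS.mmm]" and the first group of
# five slash-separated timers that follows it in an "option httplog" line.
# A timer is -1 when that phase was never reached; the last one may carry a "+".
LINE_RE = re.compile(
    r"\[(?P<day>\d{2}/\w{3}/\d{4}):(?P<hm>\d{2}:\d{2}):\d{2}\.\d{3}\]"
    r".*?\s(?P<timers>-?\d+/-?\d+/-?\d+/-?\d+/\+?-?\d+)\s"
)


def connect_time_per_minute(lines):
    """Return {minute: (count, avg_tc_ms, max_tc_ms)} for lines with Tc >= 0."""
    buckets = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not an HTTP request line (health check, startup, ...)
        tc = int(match.group("timers").split("/")[2])  # third timer is Tc
        if tc < 0:
            continue  # connection to the server was never established
        minute = f"{match.group('day')} {match.group('hm')}"
        buckets[minute].append(tc)
    return {
        minute: (len(values), sum(values) / len(values), max(values))
        for minute, values in buckets.items()
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log file location depends on the local syslog setup, so it is
    # taken from the command line rather than assumed here.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        stats = connect_time_per_minute(handle)
    for minute, (count, avg_tc, max_tc) in stats.items():
        print(f"{minute}  requests={count}  avg_tc={avg_tc:.1f}ms  max_tc={max_tc}ms")
```

Running this over the log from a 1.6.14 period and from a 1.8.3 period would show whether the jump really is in Tc or in one of the other timers, which would narrow down where to look.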