On 6 Feb 2018 4:38 am, "Kai Timmer" <k...@staffbase.com> wrote:

Hello,
I recently tried to update from v1.6.14 to v1.8.3 but experienced a lot of
problems with it.

I do hope that I made a mistake in my configuration, one that works in 1.6
but blows up my system in 1.8. So I'm going to describe my setup/workload
and hope that someone here might be able to help me.

I run 2 haproxy servers in front of multiple Jetty-based backend systems.
The main source of data for those backend servers is a MongoDB. Both
haproxy servers receive traffic and use keepalived for failover.

Before I updated to v1.8, the average time to connect was about 0.5ms. After
the update it stayed like that while there was almost no traffic (at night),
but as soon as traffic picked up, the time to connect went up to 500-600ms.

The problem occurred when we had around 2000 requests per minute.

Restarting the haproxy process helped for a while. But after a few minutes
the problem was back again.

One other thing I'd like to share is that I see a peak in MongoDB write
lock times after updating haproxy. It is probably a side effect of
something else going on, but it still correlates with the new haproxy
version.

As soon as I downgraded back to v1.6.14 the problem went away.

This is the haproxy.cfg that I use (same config for 1.6.14 and 1.8.3):
##########################################################################
global
    log 127.0.0.1 local1 info
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    ulimit-n    65536
    user  haproxy
    group haproxy
    daemon
    ca-base  /etc/ssl/certs
    crt-base /etc/ssl/private
    tune.ssl.default-dh-param 2048
    ssl-default-bind-options no-sslv3 no-tls-tickets
    ssl-default-bind-ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:!DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA
    stats socket /var/run/haproxy.sock level admin
    stats timeout 30s
    maxconn 22000

defaults
    mode   http
    log    global
    option dontlognull
    option httplog
    option log-health-checks
    timeout connect 3600000ms
    timeout client 3600000ms
    timeout server 3600000ms
    timeout check 10s
    balance roundrobin
    maxconn 20000
    fullconn 2000

resolvers ourdns
   nameserver dns1    10.13.138.250:53
   nameserver dns2    10.13.138.251:53
   resolve_retries    3
   timeout retry      1s
   hold valid         30s

#---------------------------------------------------------------------
# frontend HTTP
#---------------------------------------------------------------------
frontend http-in
    option http-server-close
    bind *:80
    redirect scheme https code 301 if !{ ssl_fc }

#---------------------------------------------------------------------
# frontend HTTPS
#---------------------------------------------------------------------
frontend https-in
    option forwardfor
    http-request set-header X-Forwarded-Port %[dst_port]
    http-request set-header X-Forwarded-Proto https
    rspadd Strict-Transport-Security:\ max-age=31536000;\ includeSubdomains;\ preload
    bind *:443 ssl no-sslv3 crt /etc/ssl/ourssl/ourcert.com.pem crt /etc/ssl/oldcert/oldcert.net.pem

    acl host_backend      hdr_dom(host) -i de backend
    use_backend our.backend      if host_backend

    # deny all requests to /metrics
    acl restricted_metrics path_beg,url_dec -i /metrics
    http-request deny if restricted_metrics

    default_backend eyo.backend
    rspidel ^Server:.*$
    rspidel ^X-Powered-By:.*$


#---------------------------------------------------------------------
# backend
#---------------------------------------------------------------------
backend our.backend
    description Backend (Java, Port 8080)
    option http-server-close
    option http-pretend-keepalive
    timeout server 2700000ms

    option httpchk GET /health HTTP/1.0\r\nAuthorization:\ Basic\ foobar
    http-check expect status 200
    default-server inter 5s fall 2 rise 2
    server backend01 backend01.prod:8080 resolvers ourdns check slowstart 60s # regular server
    server backend02 backend02.prod:8080 resolvers ourdns check slowstart 60s # regular server
    server backend03 backend03.prod:8080 resolvers ourdns check slowstart 60s # regular server
    server backend04 backend04.prod:8080 resolvers ourdns check slowstart 60s # regular server
    server backend05 backend05.prod:8080 resolvers ourdns check slowstart 60s # regular server
    server backend06 backend06.prod:8080 resolvers ourdns check slowstart 60s # regular server

##########################################################################

Any help on what might be going on here is highly appreciated.

Regards,
Kai


Maybe the logs from the time of the "incident" would be useful here.
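
Since the config uses "option httplog", each request line should carry five
slash-separated timers, with the connect time (Tc, in ms) as the third one.
Below is a minimal Python sketch for pulling Tc out of such lines, so that
the 1.6.14 and 1.8.3 runs can be compared. It assumes the default httplog
format is in use; the function name connect_times and the sample lines are
made up for illustration and are not taken from the real incident.

```python
import re
from statistics import mean

# Matches the five slash-separated timers that follow "frontend backend/server"
# in a default "option httplog" line. The third timer is Tc (connect time, ms).
TIMERS = re.compile(
    r"\]\s+\S+\s+\S+\s+(-?\d+)/(-?\d+)/(-?\d+)/(-?\d+)/\+?(-?\d+)\s"
)

def connect_times(lines):
    """Return Tc values in ms, skipping lines without timers or with Tc = -1."""
    values = []
    for line in lines:
        match = TIMERS.search(line)
        if match is None:
            continue
        tc = int(match.group(3))
        if tc >= 0:  # -1 means the connection to the server was never established
            values.append(tc)
    return values

# Hypothetical sample lines, invented for this example.
sample = [
    'haproxy[1234]: 10.0.0.1:40000 [06/Feb/2018:04:00:00.000] https-in~ '
    'our.backend/backend01 0/0/1/25/26 200 512 - - ---- 5/5/1/1/0 0/0 "GET /a HTTP/1.1"',
    'haproxy[1234]: 10.0.0.2:40001 [06/Feb/2018:09:00:00.000] https-in~ '
    'our.backend/backend02 0/0/550/30/580 200 512 - - ---- 90/90/40/8/0 0/0 "GET /b HTTP/1.1"',
    'haproxy[1234]: 10.0.0.3:40002 [06/Feb/2018:09:00:01.000] https-in~ '
    'our.backend/backend03 0/0/-1/-1/3000 503 212 - - sC-- 90/90/40/8/3 0/0 "GET /c HTTP/1.1"',
]

if __name__ == "__main__":
    tcs = connect_times(sample)
    print("requests with a valid Tc:", len(tcs))
    print("average Tc in ms:", mean(tcs))
```

To run it against a real log, the sample list can be replaced with an open
file object, since the function only iterates over lines.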
