Re: haproxy bug: healthcheck not passing after port change when statefile is enabled

Sven Wiltink Tue, 03 Jul 2018 06:42:23 -0700

Hey Baptiste,


Thank you for looking into it.


The bug is triggered by running haproxy with the following config:


global
    maxconn 32000
    tune.maxrewrite 2048
    user haproxy
    group haproxy
    daemon
    chroot /var/lib/haproxy
    nbproc 1
    maxcompcpuusage 85
    spread-checks 0
    stats socket /var/run/haproxy.sock mode 600 level admin process 1 user haproxy group haproxy
    server-state-file test
    server-state-base /var/run/haproxy/state
    master-worker no-exit-on-failure

defaults
    load-server-state-from-file global
    log global
    timeout http-request 5s
    timeout connect      2s
    timeout client       300s
    timeout server       300s
    mode http
    option dontlog-normal
    option http-server-close
    option redispatch
    option log-health-checks

listen stats
    bind :1936
    bind-process 1
    mode http
    stats enable
    stats uri /
    stats admin if TRUE

listen banaan-443-ipv4
    bind :443
    mode tcp
    server banaan-vps 127.0.0.1:443 check inter 2000


- Start haproxy (it will run healthchecks against port 443).
- Change "server banaan-vps 127.0.0.1:443 check inter 2000" to
  "server banaan-vps 127.0.0.1:80 check inter 2000".
- Save the state using:
  /bin/sh -c "echo show servers state | /usr/bin/socat /var/run/haproxy.sock - > /var/run/haproxy/state/test"
  (this is normally done by the systemd file on reload, see the initial mail).
- Reload haproxy (it still runs healthchecks against port 443, while port 80
  was expected).

If you delete the statefile and reload haproxy, it will start healthchecks
against port 80 as expected.

-Sven







________________________________
From: Baptiste <[email protected]>
Sent: Tuesday, 3 July 2018 11:38:14
To: Sven Wiltink
CC: [email protected]
Subject: Re: haproxy bug: healthcheck not passing after port change when statefile is enabled

Hi Sven,

Thanks a lot for your feedback!
I'll check how we could handle this use case with the state file.

Just to ensure I'm going to troubleshoot the right issue, could you please 
summarize how you trigger this issue in a few simple steps?
For example:
- conf v1, server port is X
- generate server state (where port is X)
- update conf to v2, where port is Y
- reload HAProxy => X is applied, while you expect to get Y instead

Baptiste



On Mon, Jun 25, 2018 at 12:55 PM, Sven Wiltink <[email protected]> wrote:

Hello,


So we've dug a little deeper, and the issue seems to be caused by the port
value in the statefile. When the target port of a server has changed between
reloads, the port recorded in the state file takes precedence over the one in
the config. When running tcpdump you can see the healthchecks being performed
against the old port. After stopping haproxy and removing the statefile, the
healthcheck is performed against the right port. When manually editing the
statefile to a random port, the healthchecks are performed against that port
instead of the one specified by the config.
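
To confirm which port a reload would inherit, the saved port can be read straight from the state file. The following is only a minimal sketch and not part of the original report: the function name saved_ports is hypothetical, and it assumes the file carries the "#"-prefixed header line printed by "show servers state", with be_name, srv_name and srv_port among the column names.

```shell
#!/bin/sh
# Hypothetical helper: list "backend server port" as recorded in a state file.
# Assumes a "#"-prefixed header line that names the columns.
saved_ports() {
    awk '
        # Map each column name from the header to its field number.
        /^#/ { for (i = 2; i <= NF; i++) col[$i] = i - 1; next }
        # Lines before the header (such as the format version) are skipped.
        ("srv_port" in col) && NF > 0 {
            print $col["be_name"], $col["srv_name"], $col["srv_port"]
        }
    ' "$1"
}
```

Running saved_ports /var/run/haproxy/state/test between saving the state and reloading should then show the old port for the affected server, if the state file is indeed where the stale value comes from.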


The code responsible for this is line 2931 of src/server.c:
http://git.haproxy.org/?p=haproxy-1.8.git;a=blob;f=src/server.c;h=523289e3bda7ca6aa15575f1928f5298760cf582;hb=HEAD#l2931

from commit:
http://git.haproxy.org/?p=haproxy-1.8.git;a=commitdiff;h=3169471964fdc49963e63f68c1fd88686821a0c4


A solution would be invalidating the state when the ports don't match.
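
Until haproxy itself does this, the same idea can be approximated outside haproxy by filtering the state file before the reload. This is a rough sketch under stated assumptions, not the proposed fix in server.c: the function name drop_stale_port is hypothetical, the expected port has to be supplied by the caller, and the file is assumed to have the "#"-prefixed header line naming be_name, srv_name and srv_port.

```shell
#!/bin/sh
# Hypothetical helper: print a state file, omitting the row of one server
# when its saved srv_port differs from the port expected by the new config.
# Usage: drop_stale_port STATE_FILE BACKEND SERVER EXPECTED_PORT
drop_stale_port() {
    awk -v be="$2" -v srv="$3" -v port="$4" '
        # Keep the header and map each column name to its field number.
        /^#/ { for (i = 2; i <= NF; i++) col[$i] = i - 1; print; next }
        # Drop only the named server, and only if its saved port is stale.
        ("srv_port" in col) && $col["be_name"] == be &&
            $col["srv_name"] == srv && $col["srv_port"] != port { next }
        # Everything else (version line, other servers) passes through.
        { print }
    ' "$1"
}
```

For the case in this thread, drop_stale_port /var/run/haproxy/state/test banaan-443-ipv4 banaan-vps 80 would print the file without the banaan-443-ipv4 row if that row still records port 443. With no saved state for that server, haproxy should fall back to the configured port, as it does when the whole statefile is deleted.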


-Sven



________________________________
From: Sven Wiltink
Sent: Tuesday, 12 June 2018 17:01:18
To: [email protected]
Subject: haproxy bug: healthcheck not passing after port change when statefile is enabled

Hello,

There seems to be a bug in the loading of state files after a configuration
change. When changing the destination port of a server, the healthchecks never
start passing if the state before the reload was down. This bug was introduced
after 1.7.9, as we cannot reproduce it on machines running that version of
haproxy. You can use the following steps to reproduce the issue:

Start with a fresh Debian 9 install.
Install socat.
Install haproxy 1.8.9 from backports.

Create a systemd file
/etc/systemd/system/haproxy.service.d/60-haproxy-server_state.conf with the
following contents:
[Service]
ExecStartPre=/bin/mkdir -p /var/run/haproxy/state
ExecReload=
ExecReload=/usr/sbin/haproxy -f ${CONFIG} -c -q $EXTRAOPTS
ExecReload=/bin/sh -c "echo show servers state | /usr/bin/socat /var/run/haproxy.sock - > /var/run/haproxy/state/test"
ExecReload=/bin/kill -USR2 $MAINPID

create the following files:
/etc/haproxy/haproxy.cfg.disabled:
global
    maxconn 32000
    tune.maxrewrite 2048
    user haproxy
    group haproxy
    daemon
    chroot /var/lib/haproxy
    nbproc 1
    maxcompcpuusage 85
    spread-checks 0
    stats socket /var/run/haproxy.sock mode 600 level admin process 1 user haproxy group haproxy
    server-state-file test
    server-state-base /var/run/haproxy/state
    master-worker no-exit-on-failure

defaults
    load-server-state-from-file global
    log global
    timeout http-request 5s
    timeout connect      2s
    timeout client       300s
    timeout server       300s
    mode http
    option dontlog-normal
    option http-server-close
    option redispatch
    option log-health-checks

listen stats
    bind :1936
    bind-process 1
    mode http
    stats enable
    stats uri /
    stats admin if TRUE

/etc/haproxy/haproxy.cfg.different-port:
global
    maxconn 32000
    tune.maxrewrite 2048
    user haproxy
    group haproxy
    daemon
    chroot /var/lib/haproxy
    nbproc 1
    maxcompcpuusage 85
    spread-checks 0
    stats socket /var/run/haproxy.sock mode 600 level admin process 1 user haproxy group haproxy
    server-state-file test
    server-state-base /var/run/haproxy/state
    master-worker no-exit-on-failure

defaults
    load-server-state-from-file global
    log global
    timeout http-request 5s
    timeout connect      2s
    timeout client       300s
    timeout server       300s
    mode http
    option dontlog-normal
    option http-server-close
    option redispatch
    option log-health-checks

listen stats
    bind :1936
    bind-process 1
    mode http
    stats enable
    stats uri /
    stats admin if TRUE

listen banaan-443-ipv4
    bind :443
    mode tcp
    server banaan-vps 127.0.0.1:80 check inter 2000
listen banaan-80-ipv4
    bind :80
    mode tcp
    server banaan-vps 127.0.0.1:80 check inter 2000

/etc/haproxy/haproxy.cfg.same-port:
global
    maxconn 32000
    tune.maxrewrite 2048
    user haproxy
    group haproxy
    daemon
    chroot /var/lib/haproxy
    nbproc 1
    maxcompcpuusage 85
    spread-checks 0
    stats socket /var/run/haproxy.sock mode 600 level admin process 1 user haproxy group haproxy
    server-state-file test
    server-state-base /var/run/haproxy/state
    master-worker no-exit-on-failure

defaults
    load-server-state-from-file global
    log global
    timeout http-request 5s
    timeout connect      2s
    timeout client       300s
    timeout server       300s
    mode http
    option dontlog-normal
    option http-server-close
    option redispatch
    option log-health-checks

listen stats
    bind :1936
    bind-process 1
    mode http
    stats enable
    stats uri /
    stats admin if TRUE

listen banaan-443-ipv4
    bind :443
    mode tcp
    server banaan-vps 127.0.0.1:443 check inter 2000
listen banaan-80-ipv4
    bind :80
    mode tcp
    server banaan-vps 127.0.0.1:80 check inter 2000


Start a netcat process to fake a webserver: nc -klp 80
cp haproxy.cfg.disabled to haproxy.cfg and start haproxy.
cp haproxy.cfg.same-port to haproxy.cfg and reload haproxy. You will now see
that the server for banaan-443-ipv4 is marked as down, as expected (nothing
is running on port 443).
Now cp haproxy.cfg.different-port to haproxy.cfg and reload haproxy again.
banaan-443-ipv4 will still be marked as down, although it now uses the same
healthcheck as the port 80 configuration:
server banaan-vps 127.0.0.1:80 check inter 2000

If we now stop haproxy and delete the statefile located at 
/var/run/haproxy/state/test and start haproxy again the server will be marked 
as up.

Thanks in advance,
Sven
