Re: haproxy bug: healthcheck not passing after port change when statefile is enabled

Baptiste Tue, 03 Jul 2018 02:38:59 -0700

Hi Sven,

Thanks a lot for your feedback!
I'll check how we could handle this use case with the state file.


Just to ensure I'm going to troubleshoot the right issue, could you please
summarize how you trigger this issue in a few simple steps?
i.e.:
- conf v1, server port is X
- generate server state (where port is X)
- update conf to v2, where port is Y
- reload HAProxy => X is applied, while you expect to get Y instead
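
To make the comparison in the steps above concrete, here is a minimal
Python sketch that reads a "show servers state" dump and reports servers
whose saved port differs from the port in the new configuration. It is only
an illustration, not how HAProxy handles this internally. It assumes the
dump contains the usual "#" header line naming the columns (be_name,
srv_name, srv_port). The function names, the sample dump values and the
CONFIGURED_PORTS mapping are all made up for the example.

```python
# Hypothetical helper: compare ports saved in a state dump with the ports
# the new configuration is expected to use. Column positions are taken from
# the "#" header line of the dump instead of being hard-coded.

def parse_state(state_text):
    """Yield one dict per server line, keyed by the header column names."""
    columns = None
    for raw in state_text.splitlines():
        stripped = raw.strip()
        if stripped.startswith("#"):
            columns = stripped.lstrip("#").split()
            continue
        fields = stripped.split()
        # Skip the version line, blank lines and truncated lines.
        if columns is None or len(fields) < len(columns):
            continue
        yield dict(zip(columns, fields))


def find_port_mismatches(state_text, configured_ports):
    """Return (backend, server, state_port, config_port) for each mismatch."""
    mismatches = []
    for row in parse_state(state_text):
        key = (row.get("be_name"), row.get("srv_name"))
        if key not in configured_ports or "srv_port" not in row:
            continue
        state_port = int(row["srv_port"])
        if state_port != configured_ports[key]:
            mismatches.append((key[0], key[1], state_port, configured_ports[key]))
    return mismatches


# Invented sample: state saved under conf v1 (port X = 443) ...
SAMPLE_STATE = """1
# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port
3 banaan-443-ipv4 1 banaan-vps 127.0.0.1 0 0 1 1 12 8 2 0 6 0 0 0 - 443
4 banaan-80-ipv4 1 banaan-vps 127.0.0.1 2 0 1 1 12 6 3 4 6 0 0 0 - 80
"""

# ... while conf v2 moves the first server to port Y = 80.
CONFIGURED_PORTS = {
    ("banaan-443-ipv4", "banaan-vps"): 80,
    ("banaan-80-ipv4", "banaan-vps"): 80,
}

if __name__ == "__main__":
    for be, srv, old, new in find_port_mismatches(SAMPLE_STATE, CONFIGURED_PORTS):
        print(f"{be}/{srv}: state file has port {old}, config has port {new}")
```

Any server reported by such a check is one where the reload would be
expected to keep using X instead of Y.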

Baptiste



On Mon, Jun 25, 2018 at 12:55 PM, Sven Wiltink <[email protected]> wrote:

> Hello,
>
>
> So we've dug a little deeper and the issue seems to be caused by the port
> value in the statefile. When the target port of a server has changed
> between reloads, the port specified in the state file takes precedence.
> When running tcpdump you can see the healthchecks are being performed for
> the old port. After stopping haproxy and removing the statefile, the
> healthcheck is performed for the right port. When manually editing the
> statefile to a random port, the healthchecks will be performed for that
> port instead of the one specified by the config.
>
>
> The code responsible for this is at
> http://git.haproxy.org/?p=haproxy-1.8.git;a=blob;f=src/server.c;h=523289e3bda7ca6aa15575f1928f5298760cf582;hb=HEAD#l2931
>
> from commit
> http://git.haproxy.org/?p=haproxy-1.8.git;a=commitdiff;h=3169471964fdc49963e63f68c1fd88686821a0c4
>
>
> A solution would be invalidating the state when the ports don't match.
>
>
> -Sven
>
>
>
> ------------------------------
> *From:* Sven Wiltink
> *Sent:* Tuesday, 12 June 2018 17:01:18
> *To:* [email protected]
> *Subject:* haproxy bug: healthcheck not passing after port change when
> statefile is enabled
>
> Hello,
>
> There seems to be a bug in the loading of state files after a
> configuration change. When changing the destination port of a server, the
> healthchecks never start passing if the state before the reload was down.
> This bug was introduced after 1.7.9, as we cannot reproduce it on
> machines running that version of haproxy. You can use the following steps
> to reproduce the issue:
>
> Start with a fresh debian 9 install
> install socat
> install haproxy 1.8.9 from backports
>
> create a systemd file
> /etc/systemd/system/haproxy.service.d/60-haproxy-server_state.conf
> with the following contents:
> [Service]
> ExecStartPre=/bin/mkdir -p /var/run/haproxy/state
> ExecReload=
> ExecReload=/usr/sbin/haproxy -f ${CONFIG} -c -q $EXTRAOPTS
> ExecReload=/bin/sh -c "echo show servers state | /usr/bin/socat /var/run/haproxy.sock - > /var/run/haproxy/state/test"
> ExecReload=/bin/kill -USR2 $MAINPID
>
> create the following files:
> /etc/haproxy/haproxy.cfg.disabled:
> global
>     maxconn 32000
>     tune.maxrewrite 2048
>     user haproxy
>     group haproxy
>     daemon
>     chroot /var/lib/haproxy
>     nbproc 1
>     maxcompcpuusage 85
>     spread-checks 0
>     stats socket /var/run/haproxy.sock mode 600 level admin process 1 user haproxy group haproxy
>     server-state-file test
>     server-state-base /var/run/haproxy/state
>     master-worker no-exit-on-failure
>
> defaults
>     load-server-state-from-file global
>     log global
>     timeout http-request 5s
>     timeout connect      2s
>     timeout client       300s
>     timeout server       300s
>     mode http
>     option dontlog-normal
>     option http-server-close
>     option redispatch
>     option log-health-checks
>
> listen stats
>     bind :1936
>     bind-process 1
>     mode http
>     stats enable
>     stats uri /
>     stats admin if TRUE
>
> /etc/haproxy/haproxy.cfg.different-port:
> global
>     maxconn 32000
>     tune.maxrewrite 2048
>     user haproxy
>     group haproxy
>     daemon
>     chroot /var/lib/haproxy
>     nbproc 1
>     maxcompcpuusage 85
>     spread-checks 0
>     stats socket /var/run/haproxy.sock mode 600 level admin process 1 user haproxy group haproxy
>     server-state-file test
>     server-state-base /var/run/haproxy/state
>     master-worker no-exit-on-failure
>
> defaults
>     load-server-state-from-file global
>     log global
>     timeout http-request 5s
>     timeout connect      2s
>     timeout client       300s
>     timeout server       300s
>     mode http
>     option dontlog-normal
>     option http-server-close
>     option redispatch
>     option log-health-checks
>
> listen stats
>     bind :1936
>     bind-process 1
>     mode http
>     stats enable
>     stats uri /
>     stats admin if TRUE
>
> listen banaan-443-ipv4
>     bind :443
>     mode tcp
>     server banaan-vps 127.0.0.1:80 check inter 2000
> listen banaan-80-ipv4
>     bind :80
>     mode tcp
>     server banaan-vps 127.0.0.1:80 check inter 2000
>
> /etc/haproxy/haproxy.cfg.same-port:
> global
>     maxconn 32000
>     tune.maxrewrite 2048
>     user haproxy
>     group haproxy
>     daemon
>     chroot /var/lib/haproxy
>     nbproc 1
>     maxcompcpuusage 85
>     spread-checks 0
>     stats socket /var/run/haproxy.sock mode 600 level admin process 1 user haproxy group haproxy
>     server-state-file test
>     server-state-base /var/run/haproxy/state
>     master-worker no-exit-on-failure
>
> defaults
>     load-server-state-from-file global
>     log global
>     timeout http-request 5s
>     timeout connect      2s
>     timeout client       300s
>     timeout server       300s
>     mode http
>     option dontlog-normal
>     option http-server-close
>     option redispatch
>     option log-health-checks
>
> listen stats
>     bind :1936
>     bind-process 1
>     mode http
>     stats enable
>     stats uri /
>     stats admin if TRUE
>
> listen banaan-443-ipv4
>     bind :443
>     mode tcp
>     server banaan-vps 127.0.0.1:443 check inter 2000
> listen banaan-80-ipv4
>     bind :80
>     mode tcp
>     server banaan-vps 127.0.0.1:80 check inter 2000
>
>
> Start a netcat process to fake a webserver: nc -klp 80
> Copy haproxy.cfg.disabled to haproxy.cfg and start haproxy.
> Copy haproxy.cfg.same-port to haproxy.cfg and reload haproxy. You will now
> see that the server for banaan-443-ipv4 is marked as down, as expected
> (nothing is running on port 443).
> Now copy haproxy.cfg.different-port to haproxy.cfg and reload haproxy again.
> banaan-443-ipv4 will still be marked as down, although it now uses the same
> healthcheck target as the port 80 configuration:
> server banaan-vps 127.0.0.1:80 check inter 2000
>
> If we now stop haproxy, delete the statefile located at
> /var/run/haproxy/state/test and start haproxy again, the server will be
> marked as up.
>
> Thanks in advance,
> Sven
>
>
>
