Hi,

After debriefing internally, the fix will take much longer and may even lead to a new server-state file format. I'll keep you updated.

Baptiste

On Sun, Nov 4, 2018 at 7:11 PM Baptiste <[email protected]> wrote:

> Hi Sven,
>
> I reviewed the whole thing and I think the support of port in state file
> was added for SRV records, but also for the runtime api, which allows
> changing the port at runtime too.
> I'll come back to you shortly with a fix for this behavior, currently
> discussing with Willy/Fred about it.
> (it's more complicated than moving the code
> """
> if (port_str)
>         srv->svc_port = port;
> """
> a couple of lines above).
>
> Baptiste
>
>
> On Tue, Oct 9, 2018 at 10:52 AM Sven Wiltink <[email protected]> wrote:
>
>> Hey Baptiste,
>>
>>
>> We noticed the SRV patch has been merged. That should mean that we can
>> now fix this issue as well. Would you be able to fix this or should we
>> try to provide a patch?
>>
>>
>> Thanks again in advance,
>>
>> Sven
>> ------------------------------
>> *From:* Baptiste <[email protected]>
>> *Sent:* Thursday, July 12, 2018 14:52:24
>> *To:* Sven Wiltink
>> *CC:* [email protected]
>> *Subject:* Re: haproxy bug: healthcheck not passing after port change
>> when statefile is enabled
>>
>> Hi Sven,
>>
>> Thanks for the clarification.
>> It's a bit more complicated than what it is supposed to be.
>> I think we may want to apply the port only if it has been changed at
>> runtime (changed by DNS SRV records).
>>
>> The status is the following: I have a pending patch which brings SRV
>> record information into the state file. (WIP, but last mile)
>> Once it has been merged, we'll be able to fix this issue (by applying the
>> port only when the server is being managed by an SRV record).
>>
>> Baptiste
>>
>>
>> On Tue, Jul 3, 2018 at 3:41 PM, Sven Wiltink <[email protected]> wrote:
>>
>> Hey Baptiste,
>>
>>
>> Thank you for looking into it.
>>
>>
>> The bug is triggered by running haproxy with the following config:
>>
>>
>> global
>>     maxconn 32000
>>     tune.maxrewrite 2048
>>     user haproxy
>>     group haproxy
>>     daemon
>>     chroot /var/lib/haproxy
>>     nbproc 1
>>     maxcompcpuusage 85
>>     spread-checks 0
>>     stats socket /var/run/haproxy.sock mode 600 level admin process 1 user haproxy group haproxy
>>     server-state-file test
>>     server-state-base /var/run/haproxy/state
>>     master-worker no-exit-on-failure
>>
>> defaults
>>     load-server-state-from-file global
>>     log global
>>     timeout http-request 5s
>>     timeout connect 2s
>>     timeout client 300s
>>     timeout server 300s
>>     mode http
>>     option dontlog-normal
>>     option http-server-close
>>     option redispatch
>>     option log-health-checks
>>
>> listen stats
>>     bind :1936
>>     bind-process 1
>>     mode http
>>     stats enable
>>     stats uri /
>>     stats admin if TRUE
>>
>> listen banaan-443-ipv4
>>     bind :443
>>     mode tcp
>>     server banaan-vps 127.0.0.1:443 check inter 2000
>>
>>
>> - Then start haproxy (it will do healthchecks to port 443)
>> - change server banaan-vps 127.0.0.1:443 check inter 2000 to server
>> banaan-vps 127.0.0.1:80 check inter 2000
>> - save the state using /bin/sh -c "echo show servers state |
>> /usr/bin/socat /var/run/haproxy.sock - > /var/run/haproxy/state/test"
>> (this is normally done using the systemd file on reload, see initial mail)
>> - reload haproxy (it still does healthchecks to port 443 while port 80
>> was expected)
>>
>> If you delete the statefile and reload haproxy, it will start healthchecks
>> for port 80 as expected.
>>
>> -Sven
>>
>>
>> ------------------------------
>> *From:* Baptiste <[email protected]>
>> *Sent:* Tuesday, July 3, 2018 11:38:14
>> *To:* Sven Wiltink
>> *CC:* [email protected]
>> *Subject:* Re: haproxy bug: healthcheck not passing after port change
>> when statefile is enabled
>>
>> Hi Sven,
>>
>> Thanks a lot for your feedback!
>> I'll check how we could handle this use case with the state file.
>>
>> Just to ensure I'm going to troubleshoot the right issue, could you
>> please summarize how you trigger this issue in a few simple steps?
>> I.e.:
>> - conf v1, server port is X
>> - generate server state (where port is X)
>> - update conf to v2, where port is Y
>> - reload HAProxy => X is applied, while you expect to get Y instead
>>
>> Baptiste
>>
>>
>> On Mon, Jun 25, 2018 at 12:55 PM, Sven Wiltink <[email protected]>
>> wrote:
>>
>> Hello,
>>
>>
>> So we've dug a little deeper and the issue seems to be caused by the port
>> value in the statefile. When the target port of a server has changed
>> between reloads, the port specified in the state file takes precedence.
>> When running tcpdump you can see the healthchecks are being performed for
>> the old port. After stopping haproxy and removing the statefile, the
>> healthcheck is performed for the right port. When manually editing the
>> statefile to a random port, the healthchecks will be performed for that
>> port instead of the one specified by the config.
>>
>>
>> The code responsible for this is line
>> http://git.haproxy.org/?p=haproxy-1.8.git;a=blob;f=src/server.c;h=523289e3bda7ca6aa15575f1928f5298760cf582;hb=HEAD#l2931
>> from commit
>> http://git.haproxy.org/?p=haproxy-1.8.git;a=commitdiff;h=3169471964fdc49963e63f68c1fd88686821a0c4
>> .
>>
>>
>> A solution would be invalidating the state when the ports don't match.
>>
>>
>> -Sven
>>
>>
>> ------------------------------
>> *From:* Sven Wiltink
>> *Sent:* Tuesday, June 12, 2018 17:01:18
>> *To:* [email protected]
>> *Subject:* haproxy bug: healthcheck not passing after port change when
>> statefile is enabled
>>
>> Hello,
>>
>> There seems to be a bug in the loading of state files after a
>> configuration change. When changing the destination port of a server the
>> healthchecks never start passing if the state before the reload was down.
>> This bug has been introduced after 1.7.9 as we cannot reproduce it on
>> machines running that version of haproxy. You can use the following steps
>> to reproduce the issue:
>>
>> Start with a fresh debian 9 install
>> install socat
>> install haproxy 1.8.9 from backports
>>
>> create a systemd file
>> /etc/systemd/system/haproxy.service.d/60-haproxy-server_state.conf
>> with the following contents:
>> [Service]
>> ExecStartPre=/bin/mkdir -p /var/run/haproxy/state
>> ExecReload=
>> ExecReload=/usr/sbin/haproxy -f ${CONFIG} -c -q $EXTRAOPTS
>> ExecReload=/bin/sh -c "echo show servers state | /usr/bin/socat /var/run/haproxy.sock - > /var/run/haproxy/state/test"
>> ExecReload=/bin/kill -USR2 $MAINPID
>>
>> create the following files:
>> /etc/haproxy/haproxy.cfg.disabled:
>> global
>>     maxconn 32000
>>     tune.maxrewrite 2048
>>     user haproxy
>>     group haproxy
>>     daemon
>>     chroot /var/lib/haproxy
>>     nbproc 1
>>     maxcompcpuusage 85
>>     spread-checks 0
>>     stats socket /var/run/haproxy.sock mode 600 level admin process 1 user haproxy group haproxy
>>     server-state-file test
>>     server-state-base /var/run/haproxy/state
>>     master-worker no-exit-on-failure
>>
>> defaults
>>     load-server-state-from-file global
>>     log global
>>     timeout http-request 5s
>>     timeout connect 2s
>>     timeout client 300s
>>     timeout server 300s
>>     mode http
>>     option dontlog-normal
>>     option http-server-close
>>     option redispatch
>>     option log-health-checks
>>
>> listen stats
>>     bind :1936
>>     bind-process 1
>>     mode http
>>     stats enable
>>     stats uri /
>>     stats admin if TRUE
>>
>> /etc/haproxy/haproxy.cfg.different-port:
>> global
>>     maxconn 32000
>>     tune.maxrewrite 2048
>>     user haproxy
>>     group haproxy
>>     daemon
>>     chroot /var/lib/haproxy
>>     nbproc 1
>>     maxcompcpuusage 85
>>     spread-checks 0
>>     stats socket /var/run/haproxy.sock mode 600 level admin process 1 user haproxy group haproxy
>>     server-state-file test
>>     server-state-base /var/run/haproxy/state
>>     master-worker no-exit-on-failure
>>
>> defaults
>>     load-server-state-from-file global
>>     log global
>>     timeout http-request 5s
>>     timeout connect 2s
>>     timeout client 300s
>>     timeout server 300s
>>     mode http
>>     option dontlog-normal
>>     option http-server-close
>>     option redispatch
>>     option log-health-checks
>>
>> listen stats
>>     bind :1936
>>     bind-process 1
>>     mode http
>>     stats enable
>>     stats uri /
>>     stats admin if TRUE
>>
>> listen banaan-443-ipv4
>>     bind :443
>>     mode tcp
>>     server banaan-vps 127.0.0.1:80 check inter 2000
>> listen banaan-80-ipv4
>>     bind :80
>>     mode tcp
>>     server banaan-vps 127.0.0.1:80 check inter 2000
>>
>> /etc/haproxy/haproxy.cfg.same-port:
>> global
>>     maxconn 32000
>>     tune.maxrewrite 2048
>>     user haproxy
>>     group haproxy
>>     daemon
>>     chroot /var/lib/haproxy
>>     nbproc 1
>>     maxcompcpuusage 85
>>     spread-checks 0
>>     stats socket /var/run/haproxy.sock mode 600 level admin process 1 user haproxy group haproxy
>>     server-state-file test
>>     server-state-base /var/run/haproxy/state
>>     master-worker no-exit-on-failure
>>
>> defaults
>>     load-server-state-from-file global
>>     log global
>>     timeout http-request 5s
>>     timeout connect 2s
>>     timeout client 300s
>>     timeout server 300s
>>     mode http
>>     option dontlog-normal
>>     option http-server-close
>>     option redispatch
>>     option log-health-checks
>>
>> listen stats
>>     bind :1936
>>     bind-process 1
>>     mode http
>>     stats enable
>>     stats uri /
>>     stats admin if TRUE
>>
>> listen banaan-443-ipv4
>>     bind :443
>>     mode tcp
>>     server banaan-vps 127.0.0.1:443 check inter 2000
>> listen banaan-80-ipv4
>>     bind :80
>>     mode tcp
>>     server banaan-vps 127.0.0.1:80 check inter 2000
>>
>>
>> start a netcat process to fake a webserver: nc -klp 80
>> cp haproxy.cfg.disabled to haproxy.cfg and start haproxy.
>> cp haproxy.cfg.same-port to haproxy.cfg and reload haproxy. You will now
>> see that the servers for banaan-443-ipv4 are marked as down, as expected
>> (nothing is running on port 443).
>> Now cp haproxy.cfg.different-port to haproxy.cfg and reload haproxy
>> again. banaan-443-ipv4 will still be marked as down, although it uses the
>> same healthcheck as the port 80 configuration: server banaan-vps
>> 127.0.0.1:80 check inter 2000
>>
>> If we now stop haproxy and delete the statefile located at
>> /var/run/haproxy/state/test and start haproxy again the server will be
>> marked as up.
>>
>> Thanks in advance,
>> Sven
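
As an illustration of the mismatch discussed in this thread (the port saved in the state file differing from the port in the reloaded config), here is a minimal Python sketch that flags such entries in a saved `show servers state` dump. It is not HAProxy's code and not the fix being worked on. It assumes the dump contains a `#`-prefixed header line naming the columns and that those names include `be_name`, `srv_name` and `srv_port`. The function names, the shortened sample dump and the `CONFIGURED` mapping are all hypothetical; a real dump has many more columns.

```python
def parse_server_state(text):
    """Parse a saved state dump into a list of dicts keyed by header names.

    Assumption: the column names come from a '#'-prefixed header line.
    Lines before that header (such as a format version line) are skipped.
    """
    columns = None
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            columns = line.lstrip("#").split()
            continue
        if columns is None:
            continue  # nothing to map fields onto yet
        rows.append(dict(zip(columns, line.split())))
    return rows


def stale_port_entries(rows, configured_ports):
    """Return (backend, server, saved_port, configured_port) for mismatches.

    configured_ports maps (backend name, server name) to the port taken
    from the current config; building that mapping is left to the caller.
    """
    stale = []
    for row in rows:
        key = (row.get("be_name"), row.get("srv_name"))
        saved = row.get("srv_port", "")
        wanted = configured_ports.get(key)
        if wanted is None or not saved.isdigit():
            continue  # unknown server, or no usable port in the dump
        if int(saved) != wanted:
            stale.append((key[0], key[1], int(saved), wanted))
    return stale


# Hypothetical, shortened dump: only the columns this sketch needs.
SAMPLE_STATE = """\
1
# be_id be_name srv_id srv_name srv_addr srv_port
3 banaan-443-ipv4 1 banaan-vps 127.0.0.1 443
4 banaan-80-ipv4 1 banaan-vps 127.0.0.1 80
"""

# Ports as they would read in haproxy.cfg.different-port.
CONFIGURED = {
    ("banaan-443-ipv4", "banaan-vps"): 80,
    ("banaan-80-ipv4", "banaan-vps"): 80,
}

if __name__ == "__main__":
    rows = parse_server_state(SAMPLE_STATE)
    for be, srv, saved, wanted in stale_port_entries(rows, CONFIGURED):
        print(f"{be}/{srv}: state file has port {saved}, config has {wanted}")
```

A check like this could run between dumping the state and sending the reload signal, for instance to drop stale lines from the file. Whether dropping them is the right behaviour is exactly what is still being discussed above, since a port changed at runtime or through SRV records is meant to survive a reload.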

