On Wed, Nov 27, 2013 at 10:41:12PM +0800, Igor wrote:
> It's almost the same except some servers with weight=0 in conf, script
> to disable/enable servers works fine with haproxy-ss-20131031, but
> with  haproxy-ss-20131122 or newer, it will crash haproxy after
> several disable/enable commands via socket. I can mail you the conf if
> need.

OK thank you for your config, Igor. It definitely helped. The issue was
related to the "track" servers with the recent changes. It caused an
(almost) infinite recursion. It was easier to see here because I have
no server responding to your addresses, so the process crashes as soon
as the first server fails the first check.

I'm attaching the fix to apply on top of your snapshot, that I'm going
to merge into mainline.

Simon, I suggest you apply the attached patch on top of your development
branch in case you still work on the lb agent checks, because then you're
affected as well.

Thanks,
Willy

>From 151cd51485f8ebe47aa3b2f2eedb604d4c2cfd0d Mon Sep 17 00:00:00 2001
From: Willy Tarreau <[email protected]>
Date: Wed, 27 Nov 2013 16:52:23 +0100
Subject: BUG/MAJOR: fix haproxy crash when using server tracking instead of
 checks

Igor at owind reported a very recent bug (just present in latest snapshot).
Commit "4a741432 MEDIUM: Paramatise functions over the check of a server"
causes up/down to die with tracked servers due to a typo.

The following call in set_server_down causes the server to put itself
down recurseively because "check" is the current server's check, so once
fed to the function again, it will pass through the exact same path (note
we have the exact symmetry in set_server_up) :

        for (srv = s->tracknext; srv; srv = srv->tracknext)
                if (!(srv->state & SRV_MAINTAIN))
                        /* Only notify tracking servers that are not already in 
maintenance. */
                        set_server_down(check);

Instead we should stop the tracking server being visited in the loop :

        for (srv = s->tracknext; srv; srv = srv->tracknext)
                if (!(srv->state & SRV_MAINTAIN))
                        /* Only notify tracking servers that are not already in 
maintenance. */
                        set_server_down(&srv->check);

But that's not exactly enough because srv->check->server is only set when
checks are enabled, so ->server is NULL for tracking servers, still causing a
crash upon first iteration. The fix is easy and consists in always initializing
check->server when creating a new server, which is what was already done a few
patches later by 69d29f9 (MEDIUM: cfgparse: Factor out check initialisation).

With the fix above alone on top of current version or snapshot 20131122, the
problem disappears.

Thanks to Igor for testing and reporting the issue.
---
 src/checks.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/checks.c b/src/checks.c
index cfbb8a3..a722c12 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -445,10 +445,10 @@ void set_server_down(struct check *check)
                s->counters.down_trans++;
 
                if (s->state & SRV_CHECKED)
-                       for(srv = s->tracknext; srv; srv = srv->tracknext)
-                               if (! (srv->state & SRV_MAINTAIN))
+                       for (srv = s->tracknext; srv; srv = srv->tracknext)
+                               if (!(srv->state & SRV_MAINTAIN))
                                        /* Only notify tracking servers that 
are not already in maintenance. */
-                                       set_server_down(check);
+                                       set_server_down(&srv->check);
        }
 
        check->health = 0; /* failure */
@@ -520,10 +520,10 @@ void set_server_up(struct check *check) {
                send_log(s->proxy, LOG_NOTICE, "%s.\n", trash.str);
 
                if (s->state & SRV_CHECKED)
-                       for(srv = s->tracknext; srv; srv = srv->tracknext)
-                               if (! (srv->state & SRV_MAINTAIN))
+                       for (srv = s->tracknext; srv; srv = srv->tracknext)
+                               if (!(srv->state & SRV_MAINTAIN))
                                        /* Only notify tracking servers if 
they're not in maintenance. */
-                                       set_server_up(check);
+                                       set_server_up(&srv->check);
        }
 
        if (check->health >= s->rise)
-- 
1.7.12.1

Reply via email to