On Wed, Mar 27, 2019 at 10:23:44PM +0300, Максим Куприянов wrote:
> Hi! Thank you very much, I'll test your patch and will write back tomorrow.
> 

Willy made me realize that my patch was wrong, so here another one.

I still cannot reproduce your problem at all, do you reproduce easily?

The only case we can imagine, is something forking from the master which is not
a worker, maybe a fork in a lua script, or in a openssl engine.


-- 
William Lallemand
>From cf0a8557e758b10b75a2bb64b0992ce052a969af Mon Sep 17 00:00:00 2001
From: William Lallemand <[email protected]>
Date: Wed, 27 Mar 2019 14:19:10 +0100
Subject: [PATCH] BUG/MEDIUM: mworker: don't free the wrong child when not
 found

A bug occurs when the sigchld handler is called and a child which is
not in the process list just left, or with an empty process list.

The child variable won't be set and left as an uninitialized variable or
set to the wrong child entry, which can lead to a free of this
uninitialized variable or of the wrong child.

This can lead to a crash of the master during a stop or a reload.

It is not supposed to happen with a worker which was created by the
master. A cause could be a fork made by a dependency. (openssl, lua ?)

This patch strengthens the case of the missing child by doing the free
only if the child was found.

This patch must be backported to 1.9.
---
 src/haproxy.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/haproxy.c b/src/haproxy.c
index 646f040db..1cb103919 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -828,9 +828,12 @@ static void mworker_catch_sigchld(struct sig_handler *sh)
 	int exitpid = -1;
 	int status = 0;
 	struct mworker_proc *child, *it;
+	int childfound;
 
 restart_wait:
 
+	childfound = 0;
+
 	exitpid = waitpid(-1, &status, WNOHANG);
 	if (exitpid > 0) {
 		if (WIFEXITED(status))
@@ -848,10 +851,11 @@ restart_wait:
 
 			LIST_DEL(&child->list);
 			close(child->ipc_fd[0]);
+			childfound = 1;
 			break;
 		}
 
-		if (!children) {
+		if (!children || !childfound) {
 			ha_warning("Worker %d exited with code %d (%s)\n", exitpid, status, (status >= 128) ? strsignal(status - 128) : "Exit");
 		} else {
 			/* check if exited child was in the current children list */
@@ -868,8 +872,8 @@ restart_wait:
 				ha_warning("Former worker #%d (%d) exited with code %d (%s)\n", child->relative_pid, exitpid, status, (status >= 128) ? strsignal(status - 128) : "Exit");
 				delete_oldpid(exitpid);
 			}
+			free(child);
 		}
-		free(child);
 
 		/* do it again to check if it was the last worker */
 		goto restart_wait;
-- 
2.19.2

Reply via email to