Hi everyone,

I noticed a nasty and strange behavior on Linux with haproxy
1.5-dev26 that causes unhandled connections (the connection is
established but never served).
I tracked it down to the combination of systemd daemon mode and
SO_REUSEPORT.

Note that, although the description talks about child processes, this
is not related to nbproc > 1; I'm just using nbproc = 1.

In systemd daemon mode the parent doesn't exit but waits for its
children without closing its listening sockets.

Since Linux 3.9 introduced the SO_REUSEPORT option (always enabled in
haproxy if available), this causes unhandled connections after an
haproxy reload while connections are still open.

The problem is that when a new parent is started on reload (-Ds
$oldpid), main() in haproxy.c calls start_proxies, which, without
SO_REUSEPORT, should fail (as the old processes are already
listening), so a SIGTTOU is sent to the old processes. On this signal
the old children call shutdown() on the listening fd (in
pause_listener). From my tests (if I understand it correctly) this
affects the in-kernel file, so the listen is really disabled for all
the processes sharing it, including the parent.

With SO_REUSEPORT, instead, the call to start_proxies doesn't fail, so
SIGTTOU is never sent. Only SIGUSR1 is sent, and the listen isn't
disabled for the parent: only the children stop listening (with a call
to close()).

So, with SO_REUSEPORT, the old children close their listening sockets
but wait for the current connections to finish or time out. As their
parent still has its listening socket open, the kernel schedules some
new connections on it. These connections are never accepted, because
the parent is sitting in the waitpid loop.
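
The difference between shutdown() and close() on an inherited listening
fd can be checked with a small sketch like the one below. It is not
haproxy code: connect_after_child_stops is a made-up name, a single
shutdown(SHUT_RD) stands in for whatever pause_listener really does, and
the result relies on Linux behavior (other systems may treat shutdown()
on a listening socket differently):

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a client can still connect after the child gave up the
 * inherited listener, 0 if the connection fails, -1 on setup error.
 * The parent keeps its copy of the fd and never calls accept(), like a
 * parent that only waits for its children. */
int connect_after_child_stops(int child_uses_shutdown)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    pid_t pid;
    int lfd, cfd, rc;

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                  /* kernel picks a free port */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 16) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) {
        close(lfd);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(lfd);
        return -1;
    }
    if (pid == 0) {
        /* Child: shutdown() acts on the socket shared with the parent,
         * close() only drops this process's descriptor. */
        if (child_uses_shutdown)
            shutdown(lfd, SHUT_RD);
        close(lfd);
        _exit(0);
    }
    while (waitpid(pid, NULL, 0) == -1 && errno == EINTR)
        ;

    /* The parent still holds lfd here but does not accept on it. */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0) {
        close(lfd);
        return -1;
    }
    rc = connect(cfd, (struct sockaddr *)&addr, sizeof(addr));
    close(cfd);
    close(lfd);
    return rc == 0 ? 1 : 0;
}
```

On Linux I would expect connect_after_child_stops(0) to return 1: the
kernel completes the handshake on the parent's socket even though nobody
will ever accept it. connect_after_child_stops(1) should return 0,
because the shutdown() stopped the listen for every process sharing the
socket.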


This is easily reproducible: start haproxy with a simple listener
(using systemctl, or by hand with the "-Ds" option; I used tcp mode,
but http mode doesn't change the behavior), telnet to the listening
address to keep a connection established, run systemctl reload haproxy
(or reload by hand with -Ds $oldchildspids), then make some requests
and watch a lot of them block.


After this long explanation: I tried to fix this by closing all the
listeners in the parent before it enters the waitpid loop (see attached
patch).

Thanks!




From 1b879fb7daba4d8e69d4bed7e758d40c0a747c6f Mon Sep 17 00:00:00 2001
From: Simone Gotti <[email protected]>
Date: Mon, 9 Jun 2014 13:54:11 +0200
Subject: [PATCH] Fix unhandled connections problem with systemd daemon mode
 and SO_REUSEPORT.

In systemd daemon mode the parent doesn't exit but waits for its
children without closing its listening sockets.

Since Linux 3.9 introduced the SO_REUSEPORT option (always enabled in
haproxy if available), this causes unhandled connections after an
haproxy reload while connections are still open.

The problem is that when a new parent is started on reload (-Ds
$oldchildspids), main() in haproxy.c calls start_proxies, which,
without SO_REUSEPORT, should fail (as the old processes are already
listening), so a SIGTTOU is sent to the old processes. On this signal
the old children call shutdown() on the listening fd (in
pause_listener). From my tests (if I understand it correctly) this
affects the in-kernel file, so the listen is really disabled for all
the processes sharing it, including the parent.

With SO_REUSEPORT, instead, the call to start_proxies doesn't fail, so
SIGTTOU is never sent. Only SIGUSR1 is sent, and the listen isn't
disabled for the parent: only the children stop listening (with a call
to close()).

So, with SO_REUSEPORT, the old children close their listening sockets
but wait for the current connections to finish or time out. As their
parent still has its listening socket open, the kernel schedules some
new connections on it. These connections are never accepted, because
the parent is sitting in the waitpid loop.

This fix will close all the listeners on the parent before entering the
waitpid loop.

Signed-off-by: Simone Gotti <[email protected]>
---
 src/haproxy.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/haproxy.c b/src/haproxy.c
index d8d8c61..6bfab06 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -1604,6 +1604,7 @@ int main(int argc, char **argv)
 
                if (proc == global.nbproc) {
                        if (global.mode & MODE_SYSTEMD) {
+                               protocol_unbind_all();
                                for (proc = 0; proc < global.nbproc; proc++)
                                         while (waitpid(children[proc], NULL, 0) == -1 && errno == EINTR);
                        }
-- 
1.9.3


