Hi Willy,

It seems I fixed the issues:

1. Already mentioned fix for epoll
2. Implemented posix_spawn() almost for all platform with fallback to
fork() (USE_POSIX_SPAWN).
   - A temporary solution to close all file descriptors (not efficient)
3. Implemented proper SOCK_CLOEXEC / FD_CLOEXEC on all open sockets
(USE_SOCK_CLOEXEC)

You can merge the changes from:
https://github.com/andvgal/haproxy/tree/20160614-fix-cloexec
Comparison against master:
https://github.com/andvgal/haproxy/compare/20160614-fix-cloexec?expand=1

I did as much testing as I could in scope of the problem. My original
test case with 10ms inter generates some false positive by itself -
that's where less than 1% of fails came from.


Patches are below:

>From eaa39496fd68afeff3b2cd1e4294518c3503db84 Mon Sep 17 00:00:00 2001
From: Andrey Galkin <[email protected]>
Date: Tue, 14 Jun 2016 03:25:11 +0300
Subject: [PATCH 1/3] BUG/MINOR: checks: fixed to create epoll with CLOEXEC

This is only part of fixed to general issue of leaking file descriptors
to child external-check processes and leading to connection data corruption
and disconnects.
---
 src/ev_epoll.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/ev_epoll.c b/src/ev_epoll.c
index ccb7c33..dbfded1 100644
--- a/src/ev_epoll.c
+++ b/src/ev_epoll.c
@@ -176,7 +176,7 @@ REGPRM1 static int _do_init(struct poller *p)
 {
     p->private = NULL;

-    epoll_fd = epoll_create(global.maxsock + 1);
+    epoll_fd = epoll_create1(EPOLL_CLOEXEC);
     if (epoll_fd < 0)
         goto fail_fd;

@@ -222,7 +222,7 @@ REGPRM1 static int _do_test(struct poller *p)
 {
     int fd;

-    fd = epoll_create(global.maxsock + 1);
+    fd = epoll_create1(EPOLL_CLOEXEC);
     if (fd < 0)
         return 0;
     close(fd);
@@ -239,7 +239,7 @@ REGPRM1 static int _do_fork(struct poller *p)
 {
     if (epoll_fd >= 0)
         close(epoll_fd);
-    epoll_fd = epoll_create(global.maxsock + 1);
+    epoll_fd = epoll_create1(EPOLL_CLOEXEC);
     if (epoll_fd < 0)
         return 0;
     return 1;
-- 
2.1.4

>From fd60e2493b6df8d5bdfa610261464095b7b2b070 Mon Sep 17 00:00:00 2001
From: Andrey Galkin <[email protected]>
Date: Tue, 14 Jun 2016 03:53:48 +0300
Subject: [PATCH 2/3] BUG/MEDIUM: checks: fixed to close all FDs on
 external-check OPTIM/MEDIUM: checks: conditional support of posix_spawn() for
 external-check

Calling fork() is suboptimal due VM management expense. posix_spawn() is
relatively thin call spawning a new process from clean image what can
make significant difference in heavy deployments.

posix_spawn() is controlled by new USE_POSIX_SPAWN setting which is enabled
by default for newer Linux, Solaris, OSX and *BSD.
---
 Makefile     | 13 +++++++++++++
 src/checks.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+)

diff --git a/Makefile b/Makefile
index 08d6108..02037ee 100644
--- a/Makefile
+++ b/Makefile
@@ -37,6 +37,7 @@
 #   USE_TFO              : enable TCP fast open. Supported on Linux >= 3.7.
 #   USE_NS               : enable network namespace support.
Supported on Linux >= 2.6.24.
 #   USE_DL               : enable it if your system requires -ldl.
Automatic on Linux.
+#   USE_POSIX_SPAWN      : enable posix_spawn() support instead of fork()
 #   USE_DEVICEATLAS      : enable DeviceAtlas api.
 #   USE_51DEGREES        : enable third party device detection
library from 51Degrees
 #
@@ -283,6 +284,7 @@ ifeq ($(TARGET),linux2628)
   ASSUME_SPLICE_WORKS= implicit
   EXTRA          += haproxy-systemd-wrapper
   USE_DL          = implicit
+  USE_POSIX_SPAWN = implicit
 else
 ifeq ($(TARGET),solaris)
   # This is for Solaris 8
@@ -294,6 +296,7 @@ ifeq ($(TARGET),solaris)
   USE_LIBCRYPT    = implicit
   USE_CRYPT_H     = implicit
   USE_GETADDRINFO = implicit
+  USE_POSIX_SPAWN = implicit
 else
 ifeq ($(TARGET),freebsd)
   # This is for FreeBSD
@@ -301,6 +304,7 @@ ifeq ($(TARGET),freebsd)
   USE_KQUEUE     = implicit
   USE_TPROXY     = implicit
   USE_LIBCRYPT   = implicit
+  USE_POSIX_SPAWN = implicit
 else
 ifeq ($(TARGET),osx)
   # This is for Mac OS/X
@@ -308,18 +312,21 @@ ifeq ($(TARGET),osx)
   USE_KQUEUE     = implicit
   USE_TPROXY     = implicit
   USE_LIBCRYPT   = implicit
+  USE_POSIX_SPAWN = implicit
 else
 ifeq ($(TARGET),openbsd)
   # This is for OpenBSD >= 3.0
   USE_POLL       = implicit
   USE_KQUEUE     = implicit
   USE_TPROXY     = implicit
+  USE_POSIX_SPAWN = implicit
 else
 ifeq ($(TARGET),netbsd)
   # This is for NetBSD
   USE_POLL       = implicit
   USE_KQUEUE     = implicit
   USE_TPROXY     = implicit
+  USE_POSIX_SPAWN = implicit
 else
 ifeq ($(TARGET),aix51)
   # This is for AIX 5.1
@@ -551,6 +558,12 @@ BUILD_OPTIONS   += $(call ignore_implicit,USE_DL)
 OPTIONS_LDFLAGS += -ldl
 endif

+ifneq ($(USE_POSIX_SPAWN),)
+OPTIONS_CFLAGS += -DUSE_POSIX_SPAWN
+BUILD_OPTIONS  += $(call ignore_implicit,USE_POSIX_SPAWN)
+endif
+
+
 # report DLMALLOC_SRC only if explicitly specified
 ifneq ($(DLMALLOC_SRC),)
 BUILD_OPTIONS += DLMALLOC_SRC=$(DLMALLOC_SRC)
diff --git a/src/checks.c b/src/checks.c
index c4ac947..3df25eb 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -22,6 +22,9 @@
 #include <string.h>
 #include <time.h>
 #include <unistd.h>
+#ifdef USE_POSIX_SPAWN
+#include <spawn.h>
+#endif /* USE_POSIX_SPAWN */
 #include <sys/socket.h>
 #include <sys/types.h>
 #include <sys/wait.h>
@@ -1821,10 +1824,43 @@ static int connect_proc_chk(struct task *t)
     struct proxy *px = s->proxy;
     int status;
     pid_t pid;
+    int fd;
+    /* In ideal situation, we should close all sockets with CLOEXEC.
+     * Unfortunately, tests show that closing only
stdin/stdout/stderr does not help.
+     */
+    int fd_limit = maxfd;
+#ifdef USE_POSIX_SPAWN
+    int spawn_res;
+    posix_spawn_file_actions_t spawn_file_actions;
+#endif /* USE_POSIX_SPAWN */

     status = SF_ERR_RESOURCE;

     block_sigchld();
+
+#ifdef USE_POSIX_SPAWN
+    /* Should not be a problem */
+    extchk_setenv(check, EXTCHK_HAPROXY_SERVER_CURCONN,
ultoa_r(s->cur_sess, buf, sizeof(buf)));
+
+    /* The correct way is to open sockets with SOCK_CLOEXEC and/or
set FD_CLOEXEC later.
+       This code should be removed when all descriptors get proper CLOEXEC. */
+    posix_spawn_file_actions_init(&spawn_file_actions);
+
+    for (fd = 0; fd <= fd_limit; ++fd) {
+        posix_spawn_file_actions_addclose(&spawn_file_actions, fd);
+    }
+
+    spawn_res = posix_spawn(&pid, px->check_command,
&spawn_file_actions, NULL, check->argv, check->envp);
+    posix_spawn_file_actions_destroy(&spawn_file_actions);
+
+    if (spawn_res != 0) {
+        Alert("Failed to fork process for external health check: %s.
Aborting.\n",
+              strerror(spawn_res));
+        set_server_check_status(check, HCHK_STATUS_SOCKERR,
strerror(spawn_res));
+        goto out;
+    }
+
+#else /* USE_POSIX_SPAWN */

     pid = fork();
     if (pid < 0) {
@@ -1836,6 +1872,13 @@ static int connect_proc_chk(struct task *t)
     if (pid == 0) {
         /* Child */
         extern char **environ;
+
+        /* The correct way is to open sockets with SOCK_CLOEXEC
and/or set FD_CLOEXEC later.
+           This code should be removed when all descriptors get
proper CLOEXEC. */
+        for (fd = 0; fd <= fd_limit; ++fd) {
+            close(fd);
+        }
+
         environ = check->envp;
         extchk_setenv(check, EXTCHK_HAPROXY_SERVER_CURCONN,
ultoa_r(s->cur_sess, buf, sizeof(buf)));
         execvp(px->check_command, check->argv);
@@ -1844,6 +1887,8 @@ static int connect_proc_chk(struct task *t)
         exit(-1);
     }

+#endif /* USE_POSIX_SPAWN */
+
     /* Parent */
     if (check->result == CHK_RES_UNKNOWN) {
         if (pid_list_add(pid, t) != NULL) {
-- 
2.1.4

>From 4949f2db93beb0d3b0a28a8dd0bffb0a226ba2ef Mon Sep 17 00:00:00 2001
From: Andrey Galkin <[email protected]>
Date: Tue, 14 Jun 2016 06:23:16 +0300
Subject: [PATCH 3/3] BUG/MEDIUM: checks/polls: implemented CLOEXEC on all FDs

Implemented efficient SOCK_CLOEXEC on socket() and accept() for Linux.
Implemented conditional fcntl() for other platforms.
---
 Makefile                 |  8 ++++++++
 include/common/accept4.h |  4 ++++
 include/proto/fd.h       | 25 +++++++++++++++++++++++++
 src/checks.c             | 23 +----------------------
 src/dns.c                |  5 ++++-
 src/listener.c           | 13 ++++++++++---
 src/log.c                |  6 +++++-
 src/namespace.c          |  7 ++++++-
 src/proto_uxst.c         | 18 +++++++++++++++---
 9 files changed, 78 insertions(+), 31 deletions(-)

diff --git a/Makefile b/Makefile
index 02037ee..775e257 100644
--- a/Makefile
+++ b/Makefile
@@ -38,6 +38,7 @@
 #   USE_NS               : enable network namespace support.
Supported on Linux >= 2.6.24.
 #   USE_DL               : enable it if your system requires -ldl.
Automatic on Linux.
 #   USE_POSIX_SPAWN      : enable posix_spawn() support instead of fork()
+#   USE_SOCK_CLOEXEC     : enable use of SOCK_CLOEXEC in socket() and
accept() calls
 #   USE_DEVICEATLAS      : enable DeviceAtlas api.
 #   USE_51DEGREES        : enable third party device detection
library from 51Degrees
 #
@@ -267,6 +268,7 @@ ifeq ($(TARGET),linux26)
   USE_FUTEX       = implicit
   EXTRA          += haproxy-systemd-wrapper
   USE_DL          = implicit
+  SOCK_CLOEXEC    = implicit
 else
 ifeq ($(TARGET),linux2628)
   # This is for standard Linux >= 2.6.28 with netfilter, epoll,
tproxy and splice
@@ -285,6 +287,7 @@ ifeq ($(TARGET),linux2628)
   EXTRA          += haproxy-systemd-wrapper
   USE_DL          = implicit
   USE_POSIX_SPAWN = implicit
+  USE_SOCK_CLOEXEC = implicit
 else
 ifeq ($(TARGET),solaris)
   # This is for Solaris 8
@@ -563,6 +566,11 @@ OPTIONS_CFLAGS += -DUSE_POSIX_SPAWN
 BUILD_OPTIONS  += $(call ignore_implicit,USE_POSIX_SPAWN)
 endif

+ifneq ($(USE_SOCK_CLOEXEC),)
+OPTIONS_CFLAGS += -DUSE_SOCK_CLOEXEC
+BUILD_OPTIONS  += $(call ignore_implicit,USE_SOCK_CLOEXEC)
+endif
+

 # report DLMALLOC_SRC only if explicitly specified
 ifneq ($(DLMALLOC_SRC),)
diff --git a/include/common/accept4.h b/include/common/accept4.h
index 8981b92..cee84d9 100644
--- a/include/common/accept4.h
+++ b/include/common/accept4.h
@@ -40,6 +40,10 @@
 #ifndef SOCK_NONBLOCK
 #define SOCK_NONBLOCK O_NONBLOCK
 #endif
+#ifndef SOCK_CLOEXEC
+#define SOCK_CLOEXEC O_CLOEXEC
+#endif
+

 #if defined(USE_MY_ACCEPT4) || (!defined(SYS_ACCEPT4) &&
!defined(__NR_accept4))
 #if defined(CONFIG_HAP_LINUX_VSYSCALL) && defined(__linux__) &&
defined(__i386__)
diff --git a/include/proto/fd.h b/include/proto/fd.h
index 87309bf..01ef803 100644
--- a/include/proto/fd.h
+++ b/include/proto/fd.h
@@ -26,6 +26,7 @@
 #include <sys/time.h>
 #include <sys/types.h>
 #include <unistd.h>
+#include <fcntl.h>

 #include <common/config.h>
 #include <types/fd.h>
@@ -339,6 +340,30 @@ static inline void fd_insert(int fd)
         maxfd = fd + 1;
 }

+/* Set FD_CLOEXEC on file descriptor to avoid its leakage into children */
+static inline void fd_set_cloexec(int fd)
+{
+    fcntl(fd, F_SETFD, FD_CLOEXEC);
+}
+
+/* Add SOCK_CLOEXEC, if enabled */
+static inline int sock_type_cloexec(int sock_type)
+{
+#ifdef USE_SOCK_CLOEXEC
+    return sock_type|SOCK_CLOEXEC;
+#else /* USE_SOCK_CLOEXEC */
+    return sock_type;
+#endif /* USE_SOCK_CLOEXEC */
+}
+
+/* Call fd_set_cloexec, SOCK_CLOEXEC is not enabled*/
+static inline void fd_cond_set_cloexec(int fd)
+{
+#ifndef USE_SOCK_CLOEXEC
+    fd_set_cloexec(fd);
+#endif /* USE_SOCK_CLOEXEC */
+}
+

 #endif /* _PROTO_FD_H */

diff --git a/src/checks.c b/src/checks.c
index 3df25eb..5354ac6 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -1824,14 +1824,8 @@ static int connect_proc_chk(struct task *t)
     struct proxy *px = s->proxy;
     int status;
     pid_t pid;
-    int fd;
-    /* In ideal situation, we should close all sockets with CLOEXEC.
-     * Unfortunately, tests show that closing only
stdin/stdout/stderr does not help.
-     */
-    int fd_limit = maxfd;
 #ifdef USE_POSIX_SPAWN
     int spawn_res;
-    posix_spawn_file_actions_t spawn_file_actions;
 #endif /* USE_POSIX_SPAWN */

     status = SF_ERR_RESOURCE;
@@ -1842,16 +1836,7 @@ static int connect_proc_chk(struct task *t)
     /* Should not be a problem */
     extchk_setenv(check, EXTCHK_HAPROXY_SERVER_CURCONN,
ultoa_r(s->cur_sess, buf, sizeof(buf)));

-    /* The correct way is to open sockets with SOCK_CLOEXEC and/or
set FD_CLOEXEC later.
-       This code should be removed when all descriptors get proper CLOEXEC. */
-    posix_spawn_file_actions_init(&spawn_file_actions);
-
-    for (fd = 0; fd <= fd_limit; ++fd) {
-        posix_spawn_file_actions_addclose(&spawn_file_actions, fd);
-    }
-
-    spawn_res = posix_spawn(&pid, px->check_command,
&spawn_file_actions, NULL, check->argv, check->envp);
-    posix_spawn_file_actions_destroy(&spawn_file_actions);
+    spawn_res = posix_spawn(&pid, px->check_command, NULL, NULL,
check->argv, check->envp);

     if (spawn_res != 0) {
         Alert("Failed to fork process for external health check: %s.
Aborting.\n",
@@ -1873,12 +1858,6 @@ static int connect_proc_chk(struct task *t)
         /* Child */
         extern char **environ;

-        /* The correct way is to open sockets with SOCK_CLOEXEC
and/or set FD_CLOEXEC later.
-           This code should be removed when all descriptors get
proper CLOEXEC. */
-        for (fd = 0; fd <= fd_limit; ++fd) {
-            close(fd);
-        }
-
         environ = check->envp;
         extchk_setenv(check, EXTCHK_HAPROXY_SERVER_CURCONN,
ultoa_r(s->cur_sess, buf, sizeof(buf)));
         execvp(px->check_command, check->argv);
diff --git a/src/dns.c b/src/dns.c
index 3b3dfc5..cc35ea7 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -918,13 +918,16 @@ int dns_init_resolvers(void)
             dgram->data = &resolve_dgram_cb;

             /* create network UDP socket for this nameserver */
-            if ((fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) == -1) {
+            if ((fd = socket(AF_INET, sock_type_cloexec(SOCK_DGRAM),
IPPROTO_UDP)) == -1) {
                 Alert("Starting [%s/%s] nameserver: can't create
socket.\n", curr_resolvers->id,
                         curnameserver->id);
                 free(dgram);
                 dgram = NULL;
                 return 0;
             }
+
+            /* avoid the socket getting affected by child processes */
+            fd_cond_set_cloexec(fd);

             /* "connect" the UDP socket to the name server IP */
             if (connect(fd, (struct sockaddr*)&curnameserver->addr,
get_addr_len(&curnameserver->addr)) == -1) {
diff --git a/src/listener.c b/src/listener.c
index 59385f0..7f0e8d6 100644
--- a/src/listener.c
+++ b/src/listener.c
@@ -387,12 +387,19 @@ void listener_accept(int fd)
          * fallback to the legacy accept() + fcntl().
          */
         if (unlikely(accept4_broken ||
-            ((cfd = accept4(fd, (struct sockaddr *)&addr, &laddr,
SOCK_NONBLOCK)) == -1 &&
+            ((cfd = accept4(fd, (struct sockaddr *)&addr, &laddr,
SOCK_NONBLOCK|SOCK_CLOEXEC)) == -1 &&
             (errno == ENOSYS || errno == EINVAL || errno == EBADF) &&
-            (accept4_broken = 1))))
+            (accept4_broken = 1)))) {
 #endif
-            if ((cfd = accept(fd, (struct sockaddr *)&addr, &laddr)) != -1)
+            if ((cfd = accept(fd, (struct sockaddr *)&addr, &laddr)) != -1) {
+                fd_set_cloexec(cfd);
                 fcntl(cfd, F_SETFL, O_NONBLOCK);
+            }
+#ifdef USE_ACCEPT4
+        } else {
+            fd_cond_set_cloexec(cfd);
+        }
+#endif

         if (unlikely(cfd == -1)) {
             switch (errno) {
diff --git a/src/log.c b/src/log.c
index 99c89f3..45dae0b 100644
--- a/src/log.c
+++ b/src/log.c
@@ -31,6 +31,7 @@
 #include <types/global.h>
 #include <types/log.h>

+#include <proto/fd.h>
 #include <proto/frontend.h>
 #include <proto/proto_http.h>
 #include <proto/log.h>
@@ -1103,12 +1104,15 @@ void __send_log(struct proxy *p, int level,
char *message, size_t size, char *sd
         if (unlikely(*plogfd < 0)) {
             /* socket not successfully initialized yet */
             int proto = logsrv->addr.ss_family == AF_UNIX ? 0 : IPPROTO_UDP;
+            int sock_type = sock_type_cloexec(SOCK_DGRAM);

-            if ((*plogfd = socket(logsrv->addr.ss_family, SOCK_DGRAM,
proto)) < 0) {
+            if ((*plogfd = socket(logsrv->addr.ss_family, sock_type,
proto)) < 0) {
                 Alert("socket for logger #%d failed: %s (errno=%d)\n",
                       nblogger, strerror(errno), errno);
                 continue;
             }
+            /* avoid the socket getting affected by child processes */
+            fd_cond_set_cloexec(*plogfd);
             /* we don't want to receive anything on this socket */
             setsockopt(*plogfd, SOL_SOCKET, SO_RCVBUF, &zero, sizeof(zero));
             /* does nothing under Linux, maybe needed for others */
diff --git a/src/namespace.c b/src/namespace.c
index e5ebfd7..4f078af 100644
--- a/src/namespace.c
+++ b/src/namespace.c
@@ -14,6 +14,7 @@
 #include <common/compiler.h>
 #include <common/hash.h>
 #include <common/errors.h>
+#include <proto/fd.h>
 #include <proto/log.h>
 #include <types/global.h>

@@ -101,7 +102,11 @@ int my_socketat(const struct netns_entry *ns, int
domain, int type, int protocol
     if (default_namespace >= 0 && ns && setns(ns->fd, CLONE_NEWNET) == -1)
         return -1;
 #endif
-    sock = socket(domain, type, protocol);
+    sock = socket(domain, sock_type_cloexec(type), protocol);
+
+    /* avoid the socket getting affected by child processes */
+    fd_cond_set_cloexec(sock);
+

 #ifdef CONFIG_HAP_NS
     if (default_namespace >= 0 && ns && setns(default_namespace,
CLONE_NEWNET) == -1) {
diff --git a/src/proto_uxst.c b/src/proto_uxst.c
index b2a7fe2..537b6ce 100644
--- a/src/proto_uxst.c
+++ b/src/proto_uxst.c
@@ -127,10 +127,13 @@ static void destroy_uxst_socket(const char *path)
      * ECONNREFUSED. In this case, we do not touch it because it's used
      * by some other process.
      */
-    sock = socket(PF_UNIX, SOCK_DGRAM, 0);
+    sock = socket(PF_UNIX, sock_type_cloexec(SOCK_DGRAM), 0);
     if (sock < 0)
         return;

+    /* avoid the socket getting affected by child processes */
+    fd_cond_set_cloexec(sock);
+
     addr.sun_family = AF_UNIX;
     strncpy(addr.sun_path, path, sizeof(addr.sun_path));
     addr.sun_path[sizeof(addr.sun_path) - 1] = 0;
@@ -237,13 +240,16 @@ static int uxst_bind_listener(struct listener
*listener, char *errmsg, int errle
     }
     addr.sun_family = AF_UNIX;

-    fd = socket(PF_UNIX, SOCK_STREAM, 0);
+    fd = socket(PF_UNIX, sock_type_cloexec(SOCK_STREAM), 0);
     if (fd < 0) {
         err |= ERR_FATAL | ERR_ALERT;
         msg = "cannot create UNIX socket";
         goto err_unlink_back;
     }

+    /* avoid the socket getting affected by child processes */
+    fd_cond_set_cloexec(fd);
+
  fd_ready:
     if (fd >= global.maxsock) {
         err |= ERR_FATAL | ERR_ALERT;
@@ -415,6 +421,7 @@ int uxst_pause_listener(struct listener *l)
 int uxst_connect_server(struct connection *conn, int data, int delack)
 {
     int fd;
+    int sock_type;
     struct server *srv;
     struct proxy *be;

@@ -434,7 +441,9 @@ int uxst_connect_server(struct connection *conn,
int data, int delack)
         return SF_ERR_INTERNAL;
     }

-    if ((fd = conn->t.sock.fd = socket(PF_UNIX, SOCK_STREAM, 0)) == -1) {
+    sock_type = sock_type_cloexec(SOCK_STREAM);
+
+    if ((fd = conn->t.sock.fd = socket(PF_UNIX, sock_type, 0)) == -1) {
         qfprintf(stderr, "Cannot get a server socket.\n");

         if (errno == ENFILE) {
@@ -466,6 +475,9 @@ int uxst_connect_server(struct connection *conn,
int data, int delack)
         return SF_ERR_RESOURCE;
     }

+    /* avoid the socket getting affected by child processes */
+    fd_cond_set_cloexec(fd);
+
     if (fd >= global.maxsock) {
         /* do not log anything there, it's a normal condition when this option
          * is used to serialize connections to a server !
-- 
2.1.4



Thanks,
Andrey Galkin

On Tue, Jun 14, 2016 at 1:55 AM, Andrey G. <[email protected]> wrote:
> Hi Willy,
>
> I believe your workaround fits socket data pollution, but it does not
> really help with epoll connection drop. However, instead of forcibly
> closing the sockets on each fork(), perhaps, setting FD_CLOEXEC right
> after accept()/socket() is much safer and potentially better
> performing with large number of descriptors and external-checks.
> Alternative way, is to try to use posix_spawn().
>
> Also,I have noticed around 0.6% of dropped connections which can be a
> consequence of race condition on close() in child. I'll try to look
> more.
>
> Regarding, the epoll connection drop, it seems the issue comes from
> epoll FD being shared with child process and therefore affected by
> changes in it.
> Below is patch which fixes the issues remaining after close()
> workaround (at least in my tests). Note: maxsock is unused parameter -
> it is used to be utilized only as a hint in pre 2.6.8 kernels.
>
> Just in case, I contribute the patch according to LICENSE and yes, I
> am a bit violating CONTRIBUTING rules. Sorry :(
> NOTE: it also requires a separate socket data pollution fix as well!
> --- haproxy-1.6.5.orig/src/ev_epoll.c   2016-05-10 16:42:00.000000000 +0300
> +++ haproxy-1.6.5/src/ev_epoll.c        2016-06-14 01:21:45.957098610 +0300
> @@ -176,7 +176,7 @@ REGPRM1 static int _do_init(struct polle
>  {
>         p->private = NULL;
>
> -       epoll_fd = epoll_create(global.maxsock + 1);
> +       epoll_fd = epoll_create1(EPOLL_CLOEXEC);
>         if (epoll_fd < 0)
>                 goto fail_fd;
>
> @@ -222,7 +222,7 @@ REGPRM1 static int _do_test(struct polle
>  {
>         int fd;
>
> -       fd = epoll_create(global.maxsock + 1);
> +       fd = epoll_create1(EPOLL_CLOEXEC);
>         if (fd < 0)
>                 return 0;
>         close(fd);
> @@ -239,7 +239,7 @@ REGPRM1 static int _do_fork(struct polle
>  {
>         if (epoll_fd >= 0)
>                 close(epoll_fd);
> -       epoll_fd = epoll_create(global.maxsock + 1);
> +       epoll_fd = epoll_create1(EPOLL_CLOEXEC);
>         if (epoll_fd < 0)
>                 return 0;
>         return 1;
>
> Thanks,
> Andrey Galkin
> Андрей
>
>
> On Mon, Jun 13, 2016 at 10:35 PM, Willy Tarreau <[email protected]> wrote:
>> Hi Andrey,
>>
>> On Sun, Jun 12, 2016 at 04:15:49PM +0300, Andrey G. wrote:
>>> Hi,
>>>
>>> I am seeing quite strange and not simple to reproduce behavior found
>>> during stress testing of https://github.com/codingfuture/puppet-cfdb .
>>> After a long testing with different options, I am quite sure the issue
>>> comes from epoll support as "noepoll" fixes the problem.
>>>
>>> Here's the scenario:
>>>
>>> 1. There are several DB clusters accessible through TCP
>>> 2. HAProxy 1.6.5 on Debian Jessie accepts clients on UNIX sockets
>>> under /run/{service}/ (easier auto-configuration, security and per
>>> role maxconn control)
>>> 3. external-check python-based (yep..) scripts are used to properly
>>> identify ready nodes in clusters
>>> 4. As cluster connections may be wrapped in TLS, external-check
>>> scripts use dedicated no-check "listen" blocks in HAProxy, but do not
>>> connect directly.
>>> 5. As there is a known recently discovered issue with stdout/stderr
>>> data clobbering, I made sure external-check scripts to close mentioned
>>> descriptors (the example below uses /dev/null instead).
>>
>> I think the issue is very similar. In fact, closing stdin/stdout is not
>> enough, we need to close *all* FDs there. epoll uses a specific FD and
>> I suspect that it is what is causing the issue. An alternate reason could
>> be that its presence simply shifts another FD by 1 and makes it conflict
>> with one FD used by the external check. Normal scripts should not assume
>> that fds above 2 may be used without initialization. But note that it is
>> also possible that epoll is assigned fd #2 and causes all the trouble.
>>
>> Thus now I'd do this after the fork for external checks instead in
>> src/checks.c to guarantee we close all open FDs :
>>
>>         if (pid == 0) {
>>                 /* Child */
>>                 extern char **environ;
>> +               int fd;
>> +
>> +               for (fd = 0; fd <= maxfd; fd++)
>> +                        close(fd);
>>
>> Is it something you could check ? By the way, thanks a lot for your bug
>> report, it's very well detailed!
>>
>> Willy

Reply via email to