Re: ModSecurity: First integration patches
On 11-04-2017 10:49, Thierry Fournier wrote:
> Hi list,
>
> I attach one usage of HAProxy / SPOE: WAF offloading. These patches
> are a first version; they have some limitations, described in the
> README file in the directory contrib/modsecurity.
>
> - Christopher, please check the patch "BUG/MINOR", it is about SPOE
>   functions.
> - The example of ModSecurity compilation can be improved. It is based
>   on my local distro.
>
> Feedback is welcome.
>
> Thierry

I agree with Olivier, that's really great stuff ;-)

To which branch (I assume master) can I apply this patch? I will then
create a docker image for testing ;-)

Aleks
Re: ModSecurity: First integration patches
> On 12 Apr 2017, at 09:57, Aleksandar Lazic wrote:
>
> On 11-04-2017 10:49, Thierry Fournier wrote:
>> Hi list,
>> I attach one usage of HAProxy / SPOE: WAF offloading.
>> These patches are a first version; they have some limitations,
>> described in the README file in the directory contrib/modsecurity.
>> - Christopher, please check the patch "BUG/MINOR", it is about SPOE
>> functions.
>> - The example of ModSecurity compilation can be improved. It is based
>> on my local distro.
>> Feedback is welcome.
>
> I agree with Olivier that's really great stuff ;-)

Thanks.

> To which branch (I assume master) can I apply this patch?

You’re right. The core haproxy patch is very light; I guess it can be
applied on any branch from 1.7.0 (it needs SPOE).

> I will then create a docker image for testing ;-)

Great!

>> Thierry
>
> Aleks
Re: ModSecurity: First integration patches
On 11/04/2017 at 10:49, Thierry Fournier wrote:
> Hi list,
> I attach one usage of HAProxy / SPOE: WAF offloading.
> These patches are a first version; they have some limitations,
> described in the README file in the directory contrib/modsecurity.
> - Christopher, please check the patch "BUG/MINOR", it is about SPOE
> functions.
> - The example of ModSecurity compilation can be improved. It is based
> on my local distro.
> Feedback is welcome.

Hi Thierry,

Really nice! I'll take a look at it soon. Glad to see the first service
that uses the SPOE! Good job.

--
Christopher Faulet
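For readers who want to try the wiring before the docker image exists: the integration follows the generic SPOE pattern. The sketch below is an assumption based on the general SPOE documentation, not a copy of Thierry's contrib/modsecurity README — the engine name, message name, argument list and agent address are illustrative and may differ from what the patches actually ship:

```
# haproxy.cfg (1.7+): hand HTTP requests to a ModSecurity SPOE agent
frontend web
    bind :8080
    filter spoe engine modsecurity config /etc/haproxy/spoe-modsecurity.conf
    default_backend app

backend spoe-modsecurity
    mode tcp
    server agent1 127.0.0.1:12345

# /etc/haproxy/spoe-modsecurity.conf
[modsecurity]
spoe-agent modsecurity-agent
    messages check-request
    timeout hello      100ms
    timeout idle       30s
    timeout processing 15ms
    use-backend spoe-modsecurity

spoe-message check-request
    args unique-id method path query req.ver req.hdrs_bin req.body_size req.body
    event on-frontend-http-request
```

The agent process (the ModSecurity wrapper from contrib/) listens on the address given in the SPOE backend and returns a verdict per message.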
Re: [RFC][PATCHES] seamless reload
Hi Olivier,

I was very eager to try out your patch set, thanks a lot! However, after
applying all of them (including the last three), it seems that using
environment variables in the config is broken (i.e. ${VARNAME} does not
get replaced with the value of the environment variable anymore). I am
using the systemd-wrapper, but not systemd.

Just from looking at the patches I couldn't make out what exactly might
be the problem. I guess it could be either the wrapper not passing on
its environment properly or haproxy not using it properly anymore. If
you have any immediate ideas, let me know. I'll try to fully debug this
when I can spare the time.

Thanks a lot,
Conrad

On 04/10/2017 08:09 PM, Olivier Houchard wrote:
> Hi,
>
> On top of those patches, here are 3 more patches.
> The first one makes the systemd wrapper check for a
> HAPROXY_STATS_SOCKET environment variable. If set, it will use that as
> an argument to -x when reloading the process.
> The second one sends listening unix sockets, as well as IPv4/v6
> sockets. I see no reason not to, and that means we no longer have to
> wait until the old process closes the socket before being able to
> accept new connections on it.
> The third one adds a new global option, nosockettransfer: if set, we
> assume we will never try to transfer listening sockets through the
> stats socket, and close any socket not bound to our process, to save a
> few file descriptors.
>
> Regards,
>
> Olivier

--
Conrad Hoffmann
Traffic Engineer
SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
Re: [RFC][PATCHES] seamless reload
On Wed, Apr 12, 2017 at 03:16:31PM +0200, Conrad Hoffmann wrote:
> Hi Olivier,
>
> I was very eager to try out your patch set, thanks a lot! However,
> after applying all of them (including the last three), it seems that
> using environment variables in the config is broken (i.e. ${VARNAME}
> does not get replaced with the value of the environment variable
> anymore). I am using the systemd-wrapper, but not systemd.
>
> Just from looking at the patches I couldn't make out what exactly
> might be the problem. I guess it could be either the wrapper not
> passing on its environment properly or haproxy not using it properly
> anymore. If you have any immediate ideas, let me know. I'll try to
> fully debug this when I can spare the time.
>
> Thanks a lot,
> Conrad

Hi Conrad,

You're right, that was just me being an idiot :)
Please replace 0007-MINOR-systemd-wrapper-add-support-for-passing-the-x-.patch
with the one attached, or just replace in src/haproxy-systemd-wrapper.c:

	argv = calloc(4 + main_argc + nb_pid + 1 +
	    stats_socket != NULL ? 2 : 0, sizeof(char *));

by:

	argv = calloc(4 + main_argc + nb_pid + 1 +
	    (stats_socket != NULL ? 2 : 0), sizeof(char *));

Regards,

Olivier

>From 526dca943b9cc89732c54bc43a6ce36e17b67890 Mon Sep 17 00:00:00 2001
From: Olivier Houchard
Date: Sun, 9 Apr 2017 16:28:10 +0200
Subject: [PATCH 7/9] MINOR: systemd wrapper: add support for passing the -x
 option.

Make the systemd wrapper check if HAPROXY_STATS_SOCKET is set. If set,
it will use it as an argument to the "-x" option, which makes haproxy
ask for any listening sockets on the stats socket, in order to achieve
reloads with no connection lost.
--- contrib/systemd/haproxy.service.in | 2 ++ src/haproxy-systemd-wrapper.c | 10 +- 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/contrib/systemd/haproxy.service.in b/contrib/systemd/haproxy.service.in index dca81a2..05bb716 100644 --- a/contrib/systemd/haproxy.service.in +++ b/contrib/systemd/haproxy.service.in @@ -3,6 +3,8 @@ Description=HAProxy Load Balancer After=network.target [Service] +# You can point the environment variable HAPROXY_STATS_SOCKET to a stats +# socket if you want seamless reloads. Environment="CONFIG=/etc/haproxy/haproxy.cfg" "PIDFILE=/run/haproxy.pid" ExecStartPre=@SBINDIR@/haproxy -f $CONFIG -c -q ExecStart=@SBINDIR@/haproxy-systemd-wrapper -f $CONFIG -p $PIDFILE diff --git a/src/haproxy-systemd-wrapper.c b/src/haproxy-systemd-wrapper.c index f6a9c85..457f5bd 100644 --- a/src/haproxy-systemd-wrapper.c +++ b/src/haproxy-systemd-wrapper.c @@ -92,11 +92,15 @@ static void spawn_haproxy(char **pid_strv, int nb_pid) pid = fork(); if (!pid) { char **argv; + char *stats_socket = NULL; int i; int argno = 0; /* 3 for "haproxy -Ds -sf" */ - argv = calloc(4 + main_argc + nb_pid + 1, sizeof(char *)); + if (nb_pid > 0) + stats_socket = getenv("HAPROXY_STATS_SOCKET"); + argv = calloc(4 + main_argc + nb_pid + 1 + + (stats_socket != NULL ? 2 : 0), sizeof(char *)); if (!argv) { fprintf(stderr, SD_NOTICE "haproxy-systemd-wrapper: failed to calloc(), please try again later.\n"); exit(1); @@ -121,6 +125,10 @@ static void spawn_haproxy(char **pid_strv, int nb_pid) argv[argno++] = "-sf"; for (i = 0; i < nb_pid; ++i) argv[argno++] = pid_strv[i]; + if (stats_socket != NULL) { + argv[argno++] = "-x"; + argv[argno++] = stats_socket; + } } argv[argno] = NULL; -- 2.9.3
Re: [RFC][PATCHES] seamless reload
On 04/12/2017 03:37 PM, Olivier Houchard wrote: > On Wed, Apr 12, 2017 at 03:16:31PM +0200, Conrad Hoffmann wrote: >> Hi Olivier, >> >> I was very eager to try out your patch set, thanks a lot! However, after >> applying all of them (including the last three), it seems that using >> environment variables in the config is broken (i.e. ${VARNAME} does not get >> replaced with the value of the environment variable anymore). I am using >> the systemd-wrapper, but not systemd. >> >> Just from looking at the patches I couldn't make out what exactly might be >> the problem. I guess it could be either the wrapper not passing on its >> environment properly or haproxy not using it properly anymore. If you have >> any immediate ideas, let me know. I'll try to fully debug this when I can >> spare the time. >> >> Thanks a lot, >> Conrad >> > > Hi Conrad, > > You're right, that was just me being an idiot :) > Please replace 0007-MINOR-systemd-wrapper-add-support-for-passing-the-x-.patch > with the one attached, or just replace in src/haproxy-systemd-wrapper.c : > > argv = calloc(4 + main_argc + nb_pid + 1 + > stats_socket != NULL ? 2 : 0, sizeof(char *)); > by : > argv = calloc(4 + main_argc + nb_pid + 1 + > (stats_socket != NULL ? 2 : 0), sizeof(char *)); > > Regards, > > Olivier Works indeed, hard to spot :) Thanks a lot, I'll now get back to testing the actual reloads! Conrad -- Conrad Hoffmann Traffic Engineer SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany Managing Director: Alexander Ljung | Incorporated in England & Wales with Company No. 6343600 | Local Branch Office | AG Charlottenburg | HRB 110657B
Re: [RFC][PATCHES] seamless reload
Hi again, so I tried to get this to work, but didn't manage yet. I also don't quite understand how this is supposed to work. The first haproxy process is started _without_ the -x option, is that correct? Where does that instance ever create the socket for transfer to later instances? I have it working now insofar that on reload, subsequent instances are spawned with the -x option, but they'll just complain that they can't get anything from the unix socket (because, for all I can tell, it's not there?). I also can't see the relevant code path where this socket gets created, but I didn't have time to read all of it yet. Am I doing something wrong? Did anyone get this to work with the systemd-wrapper so far? Also, but this might be a coincidence, my test setup takes a huge performance penalty just by applying your patches (without any reloading whatsoever). Did this happen to anybody else? I'll send some numbers and more details tomorrow. Thanks a lot, Conrad On 04/12/2017 03:47 PM, Conrad Hoffmann wrote: > On 04/12/2017 03:37 PM, Olivier Houchard wrote: >> On Wed, Apr 12, 2017 at 03:16:31PM +0200, Conrad Hoffmann wrote: >>> Hi Olivier, >>> >>> I was very eager to try out your patch set, thanks a lot! However, after >>> applying all of them (including the last three), it seems that using >>> environment variables in the config is broken (i.e. ${VARNAME} does not get >>> replaced with the value of the environment variable anymore). I am using >>> the systemd-wrapper, but not systemd. >>> >>> Just from looking at the patches I couldn't make out what exactly might be >>> the problem. I guess it could be either the wrapper not passing on its >>> environment properly or haproxy not using it properly anymore. If you have >>> any immediate ideas, let me know. I'll try to fully debug this when I can >>> spare the time. 
>>> >>> Thanks a lot, >>> Conrad >>> >> >> Hi Conrad, >> >> You're right, that was just me being an idiot :) >> Please replace >> 0007-MINOR-systemd-wrapper-add-support-for-passing-the-x-.patch >> with the one attached, or just replace in src/haproxy-systemd-wrapper.c : >> >> argv = calloc(4 + main_argc + nb_pid + 1 + >> stats_socket != NULL ? 2 : 0, sizeof(char *)); >> by : >> argv = calloc(4 + main_argc + nb_pid + 1 + >> (stats_socket != NULL ? 2 : 0), sizeof(char *)); >> >> Regards, >> >> Olivier > > > Works indeed, hard to spot :) > > Thanks a lot, I'll now get back to testing the actual reloads! > > Conrad > -- Conrad Hoffmann Traffic Engineer SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany Managing Director: Alexander Ljung | Incorporated in England & Wales with Company No. 6343600 | Local Branch Office | AG Charlottenburg | HRB 110657B
Re: [RFC][PATCHES] seamless reload
On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
> Hi again,
>
> so I tried to get this to work, but didn't manage yet. I also don't
> quite understand how this is supposed to work. The first haproxy
> process is started _without_ the -x option, is that correct? Where
> does that instance ever create the socket for transfer to later
> instances?
>
> I have it working now insofar that on reload, subsequent instances are
> spawned with the -x option, but they'll just complain that they can't
> get anything from the unix socket (because, for all I can tell, it's
> not there?). I also can't see the relevant code path where this socket
> gets created, but I didn't have time to read all of it yet.
>
> Am I doing something wrong? Did anyone get this to work with the
> systemd-wrapper so far?

You're right, the first one runs without -x. The socket used is just
any stats socket.

> Also, but this might be a coincidence, my test setup takes a huge
> performance penalty just by applying your patches (without any
> reloading whatsoever). Did this happen to anybody else? I'll send some
> numbers and more details tomorrow.

Ah, this is news to me, I did not expect a performance impact. Can you
share your configuration file?

Thanks!

Olivier
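To make "just any stats socket" concrete, here is a sketch of how a setup using these patches would be driven by hand (paths and flags other than -x/-sf are illustrative assumptions): the only prerequisite is that the running config declares a stats socket, which the next process then queries with -x before asking the old one to stop:

```
# haproxy.cfg: any admin-level stats socket will do
global
    stats socket /var/run/haproxy.sock mode 600 level admin

# first start: no -x, since there is no previous process to fetch
# listening sockets from
haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid

# reload: fetch the old process's listening FDs over the stats socket,
# then tell it to finish serving and exit (-sf)
haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid \
        -x /var/run/haproxy.sock -sf $(cat /run/haproxy.pid)
```

With the systemd wrapper, exporting HAPROXY_STATS_SOCKET=/var/run/haproxy.sock makes the wrapper append the -x arguments itself on reload.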
Re: [RFC][PATCHES] seamless reload
On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote: > Hi again, > > so I tried to get this to work, but didn't manage yet. I also don't quite > understand how this is supposed to work. The first haproxy process is > started _without_ the -x option, is that correct? Where does that instance > ever create the socket for transfer to later instances? > > I have it working now insofar that on reload, subsequent instances are > spawned with the -x option, but they'll just complain that they can't get > anything from the unix socket (because, for all I can tell, it's not > there?). I also can't see the relevant code path where this socket gets > created, but I didn't have time to read all of it yet. > > Am I doing something wrong? Did anyone get this to work with the > systemd-wrapper so far? > > Also, but this might be a coincidence, my test setup takes a huge > performance penalty just by applying your patches (without any reloading > whatsoever). Did this happen to anybody else? I'll send some numbers and > more details tomorrow. > Ok I can confirm the performance issues, I'm investigating. Olivier
Re: [RFC][PATCHES] seamless reload
On Wed, Apr 12, 2017 at 05:50:54PM +0200, Olivier Houchard wrote: > On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote: > > Hi again, > > > > so I tried to get this to work, but didn't manage yet. I also don't quite > > understand how this is supposed to work. The first haproxy process is > > started _without_ the -x option, is that correct? Where does that instance > > ever create the socket for transfer to later instances? > > > > I have it working now insofar that on reload, subsequent instances are > > spawned with the -x option, but they'll just complain that they can't get > > anything from the unix socket (because, for all I can tell, it's not > > there?). I also can't see the relevant code path where this socket gets > > created, but I didn't have time to read all of it yet. > > > > Am I doing something wrong? Did anyone get this to work with the > > systemd-wrapper so far? > > > > Also, but this might be a coincidence, my test setup takes a huge > > performance penalty just by applying your patches (without any reloading > > whatsoever). Did this happen to anybody else? I'll send some numbers and > > more details tomorrow. > > > > Ok I can confirm the performance issues, I'm investigating. > Found it, I was messing with SO_LINGER when I shouldn't have been. Can you try the new version of 0003-MINOR-tcp-When-binding-socket-attempt-to-reuse-one-f.patch or replace, in src/proto_tcp.c : if (getsockopt(fd, SOL_SOCKET, SO_LINGER, &tmplinger, &len) == 0 && (tmplinger.l_onoff == 0 || tmplinger.l_linger == 0)) { /* Attempt to activate SO_LINGER, not sure what a sane * value is, using the default BSD value of 120s. 
 */
	tmplinger.l_onoff = 1;
	tmplinger.l_linger = 120;
	setsockopt(fd, SOL_SOCKET, SO_LINGER, &tmplinger,
	    sizeof(tmplinger));
}

by:

if (getsockopt(fd, SOL_SOCKET, SO_LINGER, &tmplinger, &len) == 0 &&
    (tmplinger.l_onoff == 1 || tmplinger.l_linger == 0)) {
	tmplinger.l_onoff = 0;
	tmplinger.l_linger = 0;
	setsockopt(fd, SOL_SOCKET, SO_LINGER, &tmplinger,
	    sizeof(tmplinger));
}

Thanks!

Olivier

>From 1bf2b9c550285b434c865adeb175277ce00c842b Mon Sep 17 00:00:00 2001
From: Olivier Houchard
Date: Wed, 5 Apr 2017 22:39:56 +0200
Subject: [PATCH 3/9] MINOR: tcp: When binding socket, attempt to reuse one
 from the old proc.

Try to reuse any socket from the old process, provided by the "-x"
flag, before binding a new one, assuming it is compatible. "Compatible"
here means same address and port, same namespace if any, same interface
if any, and that the following flags are the same: LI_O_FOREIGN,
LI_O_V6ONLY and LI_O_V4V6.
Also change tcp_bind_listener() to always enable/disable socket
options, instead of just doing so if it is in the configuration file,
as the option may have been removed, e.g. TCP_FASTOPEN may have been
set in the old process and removed from the new configuration, so we
have to disable it.
--- src/proto_tcp.c | 119 +++- 1 file changed, 117 insertions(+), 2 deletions(-) diff --git a/src/proto_tcp.c b/src/proto_tcp.c index 5e12b99..ea6b8f7 100644 --- a/src/proto_tcp.c +++ b/src/proto_tcp.c @@ -731,6 +731,83 @@ int tcp_connect_probe(struct connection *conn) return 0; } +/* XXX: Should probably be elsewhere */ +static int compare_sockaddr(struct sockaddr_storage *a, struct sockaddr_storage *b) +{ + if (a->ss_family != b->ss_family) { + return (-1); + } + switch (a->ss_family) { + case AF_INET: + { + struct sockaddr_in *a4 = (void *)a, *b4 = (void *)b; + if (a4->sin_port != b4->sin_port) + return (-1); + return (memcmp(&a4->sin_addr, &b4->sin_addr, + sizeof(a4->sin_addr))); + } + case AF_INET6: + { + struct sockaddr_in6 *a6 = (void *)a, *b6 = (void *)b; + if (a6->sin6_port != b6->sin6_port) + return (-1); + return (memcmp(&a6->sin6_addr, &b6->sin6_addr, + sizeof(a6->sin6_addr))); + } + default: + return (-1); + } + +} + +#define LI_MANDATORY_FLAGS (LI_O_FOREIGN | LI_O_V6ONLY | LI_O_V4V6) +/* When binding the listeners, check if a socket has been sent to us by the + * previous process that we could reuse, instead of creating a new one. + */ +static int tcp_find_compatible_fd(struct listener *l) +{ + struct xfer_sock_list *xfer_sock = xfer_sock_list; + int ret = -1; + +
Re: [RFC][PATCHES] seamless reload
Yet another patch, on top of the previous ones. This one tries to get the default value of TCP_MAXSEG by creating a temporary TCP socket, so that one can remove the "mss" entry from its configuration file, and reset the mss value for any transferred socket from the old process. Olivier >From 7dc2432f3a7c4a9e9531adafa4524a199e394f90 Mon Sep 17 00:00:00 2001 From: Olivier Houchard Date: Wed, 12 Apr 2017 19:32:15 +0200 Subject: [PATCH 10/10] MINOR: tcp: Attempt to reset TCP_MAXSEG when reusing a socket. Guess the default value for TCP_MAXSEG by binding a temporary TCP socket and getting it, and use that in case the we're reusing a socket from the old process, and its TCP_MAXSEG is different. That way one can reset the TCP_MAXSEG value to the default one when reusing sockets. --- src/proto_tcp.c | 55 --- 1 file changed, 52 insertions(+), 3 deletions(-) diff --git a/src/proto_tcp.c b/src/proto_tcp.c index ea6b8f7..8050f3e 100644 --- a/src/proto_tcp.c +++ b/src/proto_tcp.c @@ -110,6 +110,12 @@ static struct protocol proto_tcpv6 = { .nb_listeners = 0, }; +/* Default TCP parameters, got by opening a temporary TCP socket. */ +#ifdef TCP_MAXSEG +static int default_tcp_maxseg = -1; +static int default_tcp6_maxseg = -1; +#endif + /* Binds ipv4/ipv6 address to socket , unless is set, in which * case we try to bind . is a 2-bit field consisting of : * - 0 : ignore remote address (may even be a NULL pointer) @@ -829,6 +835,35 @@ int tcp_bind_listener(struct listener *listener, char *errmsg, int errlen) int ext, ready; socklen_t ready_len; const char *msg = NULL; +#ifdef TCP_MAXSEG + + /* Create a temporary TCP socket to get default parameters we can't +* guess. 
+* */ + ready_len = sizeof(default_tcp_maxseg); + if (default_tcp_maxseg == -1) { + default_tcp_maxseg = -2; + fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP); + if (fd < 0) + Warning("Failed to create a temporary socket!\n"); + else { + if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &default_tcp_maxseg, + &ready_len) == -1) + Warning("Failed to get the default value of TCP_MAXSEG\n"); + } + } + if (default_tcp6_maxseg == -1) { + default_tcp6_maxseg = -2; + fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP); + if (fd >= 0) { + if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &default_tcp6_maxseg, + &ready_len) == -1) + Warning("Failed ot get the default value of TCP_MAXSEG for IPv6\n"); + close(fd); + } + } +#endif + /* ensure we never return garbage */ if (errlen) @@ -960,10 +995,24 @@ int tcp_bind_listener(struct listener *listener, char *errmsg, int errlen) msg = "cannot set MSS"; err |= ERR_WARN; } + } else if (ext) { + int tmpmaxseg = -1; + int defaultmss; + socklen_t len = sizeof(tmpmaxseg); + + if (listener->addr.ss_family == AF_INET) + defaultmss = default_tcp_maxseg; + else + defaultmss = default_tcp6_maxseg; + + getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &tmpmaxseg, &len); + if (tmpmaxseg != defaultmss && + setsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, + &defaultmss, sizeof(defaultmss)) == -1) { + msg = "cannot set MSS"; + err |= ERR_WARN; + } } - /* XXX: No else, the way to get the default MSS will vary from system -* to system. -*/ #endif #if defined(TCP_USER_TIMEOUT) if (listener->tcp_ut) { -- 2.9.3
Re: [RFC][PATCHES] seamless reload
I had a problem testing it on Mac OS X, because cmsghdr is aligned to 4 bytes. I changed the CMSG_ALIGN(sizeof(struct cmsghdr)) call to CMSG_LEN(0) to fix it. On Wed, Apr 12, 2017 at 10:41 AM, Olivier Houchard wrote: > Yet another patch, on top of the previous ones. > This one tries to get the default value of TCP_MAXSEG by creating a > temporary > TCP socket, so that one can remove the "mss" entry from its configuration > file, > and reset the mss value for any transferred socket from the old process. > > Olivier >
Re: haproxy deleting domain socket on graceful reload if backlog overflows
This just hit us again on a different set of load balancers... if there's a listen socket overflow on a domain socket during graceful, haproxy completely deletes the domain socket and becomes inaccessible. On Tue, Feb 21, 2017 at 6:47 PM, James Brown wrote: > Under load, we're sometimes seeing a situation where HAProxy will > completely delete a bound unix domain socket after a reload. > > The "bad flow" looks something like the following: > > >- haproxy is running on pid A, bound to /var/run/domain.sock (via a >bind line in a frontend) >- we run `haproxy -sf A`, which starts a new haproxy on pid B >- pid B binds to /var/run/domain.sock.B >- pid B moves /var/run/domain.sock.B to /var/run/domain.sock (in >uxst_bind_listener) >- in the mean time, there are a zillion connections to >/var/run/domain.sock and pid B isn't started up yet; backlog is exhausted >- pid B signals pid A to shut down >- pid A runs the destroy_uxst_socket function and tries to connect to >/var/run/domain.sock to see if it's still in use. The connection fails >(because the backlog is full). Pid A unlinks /var/run/domain.sock. >Everything is sad forever now. > > I'm thinking about just commenting out the call to destroy_uxst_socket > since this is all on a tmpfs and we don't really care if spare sockets are > leaked when/if we change configuration in the future. Arguably, the > solution should be something where we don't overflow the listen socket at > all; I'm thinking about also binding to a TCP port on localhost and just > using that for the few seconds it takes to reload (since otherwise we run > out of ephemeral sockets to 127.0.0.1); it still seems wrong for haproxy to > unlink the socket, though. 
> > This has proven extremely irritating to reproduce (since it only occurs if > there's enough load to fill up the backlog on the socket between when pid B > starts up and when pid A shuts down), but I'm pretty confident that what I > described above is happening, since periodically on reloads the domain > socket isn't there and this code fits. > > Our configs are quite large, so I'm not reproducing them here. The reason > we bind on a domain socket at all is because we're running two sets of > haproxies — one in multi-process mode doing TCP-mode SSL termination > pointing back over a domain socket to a single-process haproxy applying all > of our actual config. > > -- > James Brown > Systems > Engineer > -- James Brown Engineer
Re: haproxy deleting domain socket on graceful reload if backlog overflows
Hi James,

When you do a graceful reload of haproxy, this is what happens:

1. The old process accepts no more connections; the stats page is
   stopped, and so is the socket.
2. A new haproxy instance is started, which new clients get connected
   to, and this one has the live socket.
3. When the old haproxy instance has no more clients left, it dies
   silently, leaving all the clients on the new haproxy instance.

This is expected behavior, as you want the first haproxy to die when
the last client leaves.

Regards,

Andrew Smalley
Loadbalancer.org Ltd.

On 12 April 2017 at 19:32, James Brown wrote:
> This just hit us again on a different set of load balancers... if
> there's a listen socket overflow on a domain socket during graceful,
> haproxy completely deletes the domain socket and becomes inaccessible.
>
> On Tue, Feb 21, 2017 at 6:47 PM, James Brown wrote:
>> Under load, we're sometimes seeing a situation where HAProxy will
>> completely delete a bound unix domain socket after a reload.
>>
>> The "bad flow" looks something like the following:
>>
>> - haproxy is running on pid A, bound to /var/run/domain.sock (via a
>>   bind line in a frontend)
>> - we run `haproxy -sf A`, which starts a new haproxy on pid B
>> - pid B binds to /var/run/domain.sock.B
>> - pid B moves /var/run/domain.sock.B to /var/run/domain.sock (in
>>   uxst_bind_listener)
>> - in the mean time, there are a zillion connections to
>>   /var/run/domain.sock and pid B isn't started up yet; backlog is
>>   exhausted
>> - pid B signals pid A to shut down
>> - pid A runs the destroy_uxst_socket function and tries to connect
>>   to /var/run/domain.sock to see if it's still in use. The
>>   connection fails (because the backlog is full). Pid A unlinks
>>   /var/run/domain.sock. Everything is sad forever now.
>>
>> I'm thinking about just commenting out the call to
>> destroy_uxst_socket since this is all on a tmpfs and we don't really
>> care if spare sockets are leaked when/if we change configuration in
>> the future. Arguably, the solution should be something where we
>> don't overflow the listen socket at all; I'm thinking about also
>> binding to a TCP port on localhost and just using that for the few
>> seconds it takes to reload (since otherwise we run out of ephemeral
>> sockets to 127.0.0.1); it still seems wrong for haproxy to unlink
>> the socket, though.
>>
>> This has proven extremely irritating to reproduce (since it only
>> occurs if there's enough load to fill up the backlog on the socket
>> between when pid B starts up and when pid A shuts down), but I'm
>> pretty confident that what I described above is happening, since
>> periodically on reloads the domain socket isn't there and this code
>> fits.
>>
>> Our configs are quite large, so I'm not reproducing them here. The
>> reason we bind on a domain socket at all is because we're running
>> two sets of haproxies: one in multi-process mode doing TCP-mode SSL
>> termination pointing back over a domain socket to a single-process
>> haproxy applying all of our actual config.
>>
>> --
>> James Brown
>> Systems Engineer
>
> --
> James Brown
> Engineer
Re: ModSecurity: First integration patches
Hi.

On 12-04-2017 10:08, Thierry Fournier wrote:
>> On 12 Apr 2017, at 09:57, Aleksandar Lazic wrote:
>>
>> On 11-04-2017 10:49, Thierry Fournier wrote:
>>> Hi list,
>>> I attach one usage of HAProxy / SPOE: WAF offloading.
>>> These patches are a first version; they have some limitations,
>>> described in the README file in the directory contrib/modsecurity.
>>> - Christopher, please check the patch "BUG/MINOR", it is about SPOE
>>> functions.
>>> - The example of ModSecurity compilation can be improved. It is
>>> based on my local distro.
>>> Feedback is welcome.
>>
>> I agree with Olivier that's really great stuff ;-)
>
> Thanks.
>
>> To which branch (I assume master) can I apply this patch?
>
> You’re right. The core haproxy patch is very light; I guess it can be
> applied on any branch from 1.7.0 (it needs SPOE).
>
>> I will then create a docker image for testing ;-)
>
> Great!

Do you have the patches as files where I can download them? It's easier
for docker to call a 'curl -vLO ...' than to go across a mail body ;-)

> Thierry

Aleks
Re: ModSecurity: First integration patches
On Wed, 12 Apr 2017 21:21:58 +0200, Aleksandar Lazic wrote:
> Hi.
>
> On 12-04-2017 10:08, Thierry Fournier wrote:
>>> On 12 Apr 2017, at 09:57, Aleksandar Lazic wrote:
>>>
>>> On 11-04-2017 10:49, Thierry Fournier wrote:
>>>> Hi list,
>>>> I attach one usage of HAProxy / SPOE: WAF offloading.
>>>> These patches are a first version; they have some limitations,
>>>> described in the README file in the directory contrib/modsecurity.
>>>> - Christopher, please check the patch "BUG/MINOR", it is about
>>>> SPOE functions.
>>>> - The example of ModSecurity compilation can be improved. It is
>>>> based on my local distro.
>>>> Feedback is welcome.
>>>
>>> I agree with Olivier that's really great stuff ;-)
>>
>> Thanks.
>>
>>> To which branch (I assume master) can I apply this patch?
>>
>> You’re right. The core haproxy patch is very light; I guess it can
>> be applied on any branch from 1.7.0 (it needs SPOE).
>>
>>> I will then create a docker image for testing ;-)
>>
>> Great!
>
> Do you have the patches as files where I can download them?
> It's easier for docker to call a 'curl -vLO ...' than to go across a
> mail body ;-)

Not sure I understand: I did attach the patches as files. Note that I'm
testing a new email client. So I put the patches here:

http://www.arpalert.org/0001-BUG-MINOR-change-header-declared-function-to-static-.patch
http://www.arpalert.org/0002-MINOR-Add-binary-encoding-request-sample-fetch.patch
http://www.arpalert.org/0003-MINOR-Add-ModSecurity-wrapper-as-contrib.patch

Thierry

>>>> Thierry
>>>
>>> Aleks
Re: [RFC][PATCHES] seamless reload
On Wed, Apr 12, 2017 at 11:19:37AM -0700, Steven Davidovitz wrote: > I had a problem testing it on Mac OS X, because cmsghdr is aligned to 4 > bytes. I changed the CMSG_ALIGN(sizeof(struct cmsghdr)) call to CMSG_LEN(0) > to fix it. > Oh right, I'll change that. Thanks a lot ! Olivier
Lua memory allocator
Thierry,

While instrumenting my malloc/free functions to debug a problem, I was
hit by a malloc/realloc inconsistency in the Lua allocator. The problem
is that luaL_newstate() uses malloc() to create its first objects, and
only after this do we change the allocator to use ours. Thus the first
realloc() call fails, because it tries to reallocate with one function
something that was allocated differently earlier.

I found some info regarding the fact that it was instead possible to
use lua_newstate() and pass it the allocator to use. So that's what I
did, and it solves the problem for me. I think it also results in more
accurate accounting of the memory in use by the Lua stack, since even
the very first malloc is counted now.

But could you take a quick look to ensure I didn't overlook anything?
I'm attaching the patch; if you're fine with it I can merge it.

Thanks,
Willy

>From a1ee4ff02e0755d9c8237615fe5ca7124c70c1ad Mon Sep 17 00:00:00 2001
From: Willy Tarreau
Date: Wed, 12 Apr 2017 21:40:29 +0200
Subject: MINOR: lua: ensure the memory allocator is used all the time

luaL_newstate() uses malloc() to initialize the first objects, and only
after this do we replace the allocator. This creates trouble when
replacing the standard memory allocators during debugging sessions,
since the new allocator is used to realloc() an area previously
allocated using the default malloc().

Lua provides lua_newstate() in addition to luaL_newstate(), which takes
an allocator for the initial malloc. This is exactly what we need, and
this patch does this and fixes the problem.

This has no impact outside of debugging sessions and there's no need to
backport this.
---
 src/hlua.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/hlua.c b/src/hlua.c
index 5383fe9..9b59194 100644
--- a/src/hlua.c
+++ b/src/hlua.c
@@ -7182,7 +7182,7 @@ void hlua_init(void)
 	gL.Mref = LUA_REFNIL;
 	gL.flags = 0;
 	LIST_INIT(&gL.com);
-	gL.T = luaL_newstate();
+	gL.T = lua_newstate(hlua_alloc, &hlua_global_allocator);
 	hlua_sethlua(&gL);
 	gL.Tref = LUA_REFNIL;
 	gL.task = NULL;
--
1.7.12.1
Re: ModSecurity: First integration patches
Am 12-04-2017 21:28, schrieb thierry.fourn...@arpalert.org:

> On Wed, 12 Apr 2017 21:21:58 +0200 Aleksandar Lazic wrote:
>> Hi.
>>
>> Am 12-04-2017 10:08, schrieb Thierry Fournier:
>> [snipp]
>>
>> Do you have the patches as files where I can download them?
>> It's easier for docker to call a 'curl -vLO ...' than to go across a
>> mail body ;-)
>
> Not sure I understand: I sent the patches as files. Note that I'm
> testing a new email client. So I put the patches here:
>
> http://www.arpalert.org/0001-BUG-MINOR-change-header-declared-function-to-static-.patch
> http://www.arpalert.org/0002-MINOR-Add-binary-encoding-request-sample-fetch.patch
> http://www.arpalert.org/0003-MINOR-Add-ModSecurity-wrapper-as-contrib.patch

I'm so sorry for the rush. :-(
I saw too late that you had sent the patches to the list.
Thanks for the links. I will take more care in the future.

> Thierry

Aleks
Re: haproxy deleting domain socket on graceful reload if backlog overflows
Hi Andrew:

Thanks for your feedback, but I'm describing a very specific bug wherein
the old haproxy will unlink the new haproxy's bound unix domain socket
upon reload, due to a race condition in the domain socket cleanup code,
if a listen overflow occurs while the graceful is in process.

On Wed, Apr 12, 2017 at 11:39 AM, Andrew Smalley wrote:
> HI James
>
> When you do a graceful reload of haproxy this is what happens.
>
> 1. the old process will accept no more connections and the stats page
>    is stopped and so is the socket
> 2. a new haproxy instance is started where new clients get connected
>    to, and this has the live socket
> 3. when the old haproxy instance has no more clients left it dies
>    silently leaving all the clients on the new haproxy instance.
>
> This is expected behavior as you want the first haproxy to die when
> the last client leaves.
>
> Regards
>
> Andrew Smalley
>
> Loadbalancer.org Ltd.
>
> On 12 April 2017 at 19:32, James Brown wrote:
>
>> This just hit us again on a different set of load balancers... if
>> there's a listen socket overflow on a domain socket during graceful,
>> haproxy completely deletes the domain socket and becomes inaccessible.
>>
>> On Tue, Feb 21, 2017 at 6:47 PM, James Brown wrote:
>>
>>> Under load, we're sometimes seeing a situation where HAProxy will
>>> completely delete a bound unix domain socket after a reload.
>>>
>>> The "bad flow" looks something like the following:
>>>
>>> - haproxy is running on pid A, bound to /var/run/domain.sock (via a
>>>   bind line in a frontend)
>>> - we run `haproxy -sf A`, which starts a new haproxy on pid B
>>> - pid B binds to /var/run/domain.sock.B
>>> - pid B moves /var/run/domain.sock.B to /var/run/domain.sock (in
>>>   uxst_bind_listener)
>>> - in the meantime, there are a zillion connections to
>>>   /var/run/domain.sock and pid B isn't started up yet; the backlog
>>>   is exhausted
>>> - pid B signals pid A to shut down
>>> - pid A runs the destroy_uxst_socket function and tries to connect
>>>   to /var/run/domain.sock to see if it's still in use. The
>>>   connection fails (because the backlog is full). Pid A unlinks
>>>   /var/run/domain.sock. Everything is sad forever now.
>>>
>>> I'm thinking about just commenting out the call to
>>> destroy_uxst_socket, since this is all on a tmpfs and we don't really
>>> care if spare sockets are leaked when/if we change configuration in
>>> the future. Arguably, the solution should be something where we don't
>>> overflow the listen socket at all; I'm thinking about also binding to
>>> a TCP port on localhost and just using that for the few seconds it
>>> takes to reload (since otherwise we run out of ephemeral sockets to
>>> 127.0.0.1); it still seems wrong for haproxy to unlink the socket,
>>> though.
>>>
>>> This has proven extremely irritating to reproduce (since it only
>>> occurs if there's enough load to fill up the backlog on the socket
>>> between when pid B starts up and when pid A shuts down), but I'm
>>> pretty confident that what I described above is happening, since
>>> periodically on reloads the domain socket isn't there and this code
>>> fits.
>>>
>>> Our configs are quite large, so I'm not reproducing them here. The
>>> reason we bind on a domain socket at all is because we're running two
>>> sets of haproxies — one in multi-process mode doing TCP-mode SSL
>>> termination pointing back over a domain socket to a single-process
>>> haproxy applying all of our actual config.
>>>
>>> --
>>> James Brown
>>> Systems Engineer

--
James Brown
Engineer
Re: haproxy deleting domain socket on graceful reload if backlog overflows
HI James

Thank you for your reply. I do not see how the old haproxy, being on a
separate PID, could do anything with a socket created by a new PID.

Do you bring up your new instance with real servers in a maintenance
state? That seems to be required to do a correct handover before making
them live and active/ready to handle connections.

Also, is there a SYN_BLOCK firewall rule required during the reload? I
ask because we have had no reports of such a race condition.

Regards

Andrew Smalley

Loadbalancer.org Ltd.

On 12 April 2017 at 23:34, James Brown wrote:
> Hi Andrew:
>
> Thanks for your feedback, but I'm describing a very specific bug
> wherein the old haproxy will unlink the new haproxy's bound unix
> domain socket upon reload due to a race condition in the domain socket
> cleanup code if a listen overflow occurs while the graceful is in
> process.
>
> [snipp]
>
> --
> James Brown
> Engineer
Re: ModSecurity: First integration patches
Am 12-04-2017 23:33, schrieb Aleksandar Lazic:

> Am 12-04-2017 21:28, schrieb thierry.fourn...@arpalert.org:
>> [snipp]
>>
>> So I put the patches here:
>>
>> http://www.arpalert.org/0001-BUG-MINOR-change-header-declared-function-to-static-.patch
>> http://www.arpalert.org/0002-MINOR-Add-binary-encoding-request-sample-fetch.patch
>> http://www.arpalert.org/0003-MINOR-Add-ModSecurity-wrapper-as-contrib.patch
>
> I'm so sorry for the rush. :-(
> I saw too late that you had sent the patches to the list.
> Thanks for the links. I will take more care in the future.

I have now built haproxy with modsecurity on CentOS 7.3 ;-)

I used this file for modsecurity:
https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/v3.0/master/crs-setup.conf.example

###
/usr/local/bin/modsecurity -f crs-setup.conf.example
1492041223.145110 [00] ModSecurity for nginx (STABLE)/2.9.1 (http://www.modsecurity.org/) configured.
1492041223.145159 [00] ModSecurity: APR compiled version="1.4.8"; loaded version="1.4.8"
1492041223.145193 [00] ModSecurity: PCRE compiled version="8.32 "; loaded version="8.32 2012-11-30"
1492041223.145197 [00] ModSecurity: LIBXML compiled version="2.9.1"
1492041223.145200 [00] ModSecurity: Status engine is currently disabled, enable it by set SecStatusEngine to On.
1492041228.152877 [01] 0 clients connected
1492041228.153037 [02] 0 clients connected
1492041228.153069 [03] 0 clients connected
...
###

It was a little bit challenging:

.) The patches apply only on haproxy 1.8, because some files do not
exist on 1.7 (e.g. include/proto/spoe.h).

git clone http://git.haproxy.org/git/haproxy.git/
patch -d haproxy -p 1 -i /usr/src/0001-BUG-MINOR-change-header-declared-function-to-static-.patch
patch -d haproxy -p 1 -i /usr/src/0002-MINOR-Add-binary-encoding-request-sample-fetch.patch
patch -d haproxy -p 1 -i /usr/src/0003-MINOR-Add-ModSecurity-wrapper-as-contrib.patch

.) You will need a lot of devel packages, including some httpd ones.

yum install -y apr-devel apr-util-devel gcc make libevent-devel libxml2-devel libcurl-devel httpd-devel pcre-devel yajl-devel

.) I will figure out which runtime packages are necessary.

.) I have started a Dockerfile, which you can find at github:
https://github.com/git001/haproxy-waf/blob/master/Dockerfile

Open questions for me:

.) How is transfer-encoding handled (a.k.a. streaming)?
.) How big can a content be? Where can we define some limits?
.) How can the rule-set be reloaded? stop & start || gracefully?

Again thanks Thierry for your work, this looks very good.

Regards
Aleks
Re: haproxy deleting domain socket on graceful reload if backlog overflows
On Apr 12, 2017 6:49 PM, "Andrew Smalley" wrote:

> HI James
>
> Thank you for your reply. I do not see how the old haproxy being on a
> separate PID could do anything with a socket created by a new PID.

How? Easily. Unix domain sockets are presented as files. *Any* process
can unlink a domain socket right out from under you, just like any file,
even if you have an open handle... domain sockets aren't owned by a pid.
This report, prima facie, seems entirely plausible.
Re: low load client payload intermittently dropped with a "cD" error (v1.7.3)
Thanks Bryan,

The problem I'm having is isolated to the first second of the
connection, not the end.

Here is a summary of the tcp traffic. Hopefully it makes the example
more clear.

*client connects to haproxy: (all good)*
38.057127 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [S], seq 2113072542, win 43690, options [mss 65495,sackOK,TS val 82055529 ecr 0,nop,wscale 7], length 0
38.057156 IP 127.0.0.1.9011 > 127.0.0.1.39888: Flags [S.], seq 3284611992, ack 2113072543, win 43690, options [mss 65495,sackOK,TS val 82055529 ecr 82055529,nop,wscale 7], length 0
38.057178 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [.], ack 1, win 342, options [nop,nop,TS val 82055529 ecr 82055529], length 0

*haproxy starts connecting to server (SYN) (good)*
38.057295 IP 10.10.10.10.34289 > 99.99.99.99.8000: Flags [S], seq 35567, win 29200, options [mss 1460,sackOK,TS val 82055529 ecr 0,nop,wscale 7], length 0

*client sends 198 bytes to initiate SSL connection and we have an ACK (good)*
38.060539 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [P.], seq 1:199, ack 1, win 342, options [nop,nop,TS val 82055530 ecr 82055529], length 198
38.060598 IP 127.0.0.1.9011 > 127.0.0.1.39888: Flags [.], ack 199, win 350, options [nop,nop,TS val 82055530 ecr 82055530], length 0

*haproxy finishes connecting to the server (SYNACK/ACK) (good)*
38.120527 IP 99.99.99.99.8000 > 10.10.10.10.34289: Flags [S.], seq 4125907118, ack 35568, win 28960, options [mss 1460,sackOK,TS val 662461622 ecr 82055529,nop,wscale 8], length 0
38.120619 IP 10.10.10.10.34289 > 99.99.99.99.8000: Flags [.], ack 1, win 229, options [nop,nop,TS val 82055545 ecr 662461622], length 0

*And now there is nothing for 5 seconds, when we should have seen a
data packet from haproxy to the server with the "length 198" payload.*

*It appears that haproxy never tried to send the data!?!?*

*The server then disconnects 5 seconds into the transaction since it
got no data. (That is the way the server is supposed to behave.)*
43.183207 IP 99.99.99.99.8000 > 10.10.10.10.34289: Flags [F.], seq 1, ack 1, win 114, options [nop,nop,TS val 662466683 ecr 82055545], length 0

On Mon, Apr 10, 2017 at 7:58 PM, Bryan Talbot wrote:

> On Apr 8, 2017, at 2:24 PM, Lincoln Stern wrote:
>
>> I'm not sure how to interpret this, but it appears that haproxy is
>> dropping client payload intermittently (1/100). I have included
>> tcpdumps and logs to show what is happening.
>>
>> Am I doing something wrong? I have no idea what could be causing this
>> or how to go about debugging it. I cannot reproduce it, but I do
>> observe it in production ~2 times a day across 20 instances and 2K
>> connections.
>>
>> Any help or advice would be greatly appreciated.
>
> You're in TCP mode with 60 second timeouts. So, if the connection is
> idle for that long then the proxy will disconnect. If you need idle
> connections to stick around longer and mix http and tcp traffic then
> you probably want to set "timeout tunnel" to however long you're
> willing to let idle tcp connections sit around and not impact http
> timeouts. If you only need long-lived tcp "tunnel" connections, then
> you can instead just increase both your "timeout client" and "timeout
> server" timeouts to cover your requirements.
>
> -Bryan
>
>> What I'm trying to accomplish is to provide HA availability over two
>> routes (i.e. internet providers). One acts as primary and I gave it a
>> "static-rr" "weight" of 256 and the other as backup and has a weight
>> of "1". Backup should only be used in case of primary failure.
>>
>> log:
>> Apr 4 18:55:27 app055 haproxy[13666]: 127.0.0.1:42262 [04/Apr/2017:18:54:41.585] ws-local servers/server1 1/86/45978 4503 5873 -- 0/0/0/0/0 0/0
>> Apr 4 22:46:37 app055 haproxy[13666]: 127.0.0.1:47130 [04/Apr/2017:22:46:36.931] ws-local servers/server1 1/62/663 7979 517 -- 0/0/0/0/0 0/0
>> Apr 4 22:46:38 app055 haproxy[13666]: 127.0.0.1:32931 [04/Apr/2017:22:46:37.698] ws-local servers/server1 1/55/405 3062 553 -- 1/1/1/1/0 0/0
>> Apr 4 22:46:43 app055 haproxy[13666]: 127.0.0.1:41748 [04/Apr/2017:22:46:43.190] ws-local servers/server1 1/115/452 7979 517 -- 2/2/2/2/0 0/0
>> Apr 4 22:46:46 app055 haproxy[13666]: 127.0.0.1:57226 [04/Apr/2017:22:46:43.576] ws-local servers/server1 1/76/3066 2921 538 -- 1/1/1/1/0 0/0
>> Apr 4 22:46:47 app055 haproxy[13666]: 127.0.0.1:39656 [04/Apr/2017:22:46:47.072] ws-local servers/server1 1/67/460 8254 528 -- 1/1/1/1/0 0/0
>> Apr 4 22:47:38 app055 haproxy[13666]: 127.0.0.1:39888 [04/Apr/2017:22:46:38.057] ws-local servers/server1 1/63/60001 0 0 cD 0/0/0/0/0 0/0
>> Apr 5 08:44:55 app055 haproxy[13666]: 127.0.0.1:42650 [05/Apr/2017:08:44:05.529] ws-local servers/server1 1/53/49645 4364 4113 -- 0/0/0/0/0 0/0
>>
>> tcpdump:
>> 22:46:38.057127 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [S], seq 2113072542, win 43690, options [mss 65495,sackOK,TS val 82055529 ecr 0,nop,wscale 7], length 0
>> 22:46:38.05715
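Bryan's timeout advice quoted above can be sketched as a config
fragment. The directives are real HAProxy keywords; the values are
placeholders, not taken from the thread:

```haproxy
defaults
    mode tcp
    timeout client  60s   # idle timeout toward the client
    timeout server  60s   # idle timeout toward the server
    # Once a connection becomes a pure TCP tunnel, "timeout tunnel"
    # supersedes both client and server idle timeouts, so long-lived
    # idle tunnels survive without inflating the HTTP-side timeouts.
    timeout tunnel  1h
```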
Re: haproxy deleting domain socket on graceful reload if backlog overflows
Hi Andrew, James,

On Wed, Apr 12, 2017 at 11:46:57PM +0100, Andrew Smalley wrote:
> I do not see how the old haproxy being on a separate PID could do
> anything with a socket created by a new PID.

That's what James explained. The old process tries to clean up before
leaving and tries to clean its own socket, but in practice it ends up
removing the new socket if the connection attempt fails due to the new
socket being saturated. That's done in destroy_uxst_socket().

> Also there is a SYN_BLOCK firewall rule required during the reload? I
> ask because we have had no reports of such a race condition.

That's totally irrelevant. In his case these are unix sockets, not TCP
sockets.

James, I agree with you that what is done in destroy_uxst_socket() is
completely stupid. Even the comment saying "we might have been chrooted"
indicates that in some cases we could end up removing someone else's
socket (if permissions allow it, of course). Given that most of the time
users will run chrooted anyway, these sockets are almost never cleaned
up. So I'd vote for simply killing the contents of destroy_uxst_socket()
until we find a better idea to ensure we're really seeing *this*
process' socket. For example we could note the socket's inode upon
startup and only remove the socket if it matches the stored inode. This
is just an idea, of course.

In the mean time you can work around the problem by applying just the
one-line patch below:

 static void destroy_uxst_socket(const char *path)
 {
 	struct sockaddr_un addr;
 	int sock, ret;

+	return;

 	/* if the path was cleared, we do nothing */
 	if (!*path)
 		return;

Thanks for your detailed analysis. Would you be willing to provide a
patch? You can reuse your original e-mail as the commit message, it's
fully detailed and explains the situation very well.

Thanks!
Willy
Re: low load client payload intermittently dropped with a "cD" error (v1.7.3)
Hi Lincoln,

On Wed, Apr 12, 2017 at 08:24:41PM -0400, Lincoln Stern wrote:
(...)
> *haproxy finishes connecting to the server (SYNACK/ACK) (good)*
> 38.120527 IP 99.99.99.99.8000 > 10.10.10.10.34289: Flags [S.], seq
> 4125907118, ack 35568, win 28960, options [mss 1460,sackOK,TS val
> 662461622 ecr 82055529,nop,wscale 8], length 0
> 38.120619 IP 10.10.10.10.34289 > 99.99.99.99.8000: Flags [.], ack 1,
> win 229, options [nop,nop,TS val 82055545 ecr 662461622], length 0
>
> *And now there is nothing for 5 seconds when we should have seen a
> data packet from haproxy to the server with the "length 198" payload.*
>
> *It appears that haproxy never tried to send the data!?!?*

Thanks for this capture! This is a regression introduced in 1.7.3 while
trying to fix a bug in 1.7.2. It was fixed later by this commit:

    commit 51f2336ef7738c945ccedf5f1179825ac401862c
    Author: Willy Tarreau
    Date:   Sat Mar 18 17:11:37 2017 +0100

        BUG/MAJOR: stream-int: do not depend on connection flags to detect connectio
(...)

Please upgrade to 1.7.5 and it will be OK.

Regards,
Willy
Re: [RFC][PATCHES] seamless reload
On Wed, Apr 12, 2017 at 07:41:43PM +0200, Olivier Houchard wrote:
> +	if (default_tcp_maxseg == -1) {
> +		default_tcp_maxseg = -2;
> +		fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
> +		if (fd < 0)
> +			Warning("Failed to create a temporary socket!\n");
> +		else {
> +			if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &default_tcp_maxseg,
> +			    &ready_len) == -1)
> +				Warning("Failed to get the default value of TCP_MAXSEG\n");
> +		}
> +	}

Olivier, you're missing a close(fd) here, it'll leak this fd.

> +	if (default_tcp6_maxseg == -1) {
> +		default_tcp6_maxseg = -2;
> +		fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP);
> +		if (fd >= 0) {
> +			if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &default_tcp6_maxseg,
> +			    &ready_len) == -1)
> +				Warning("Failed ot get the default value of TCP_MAXSEG for IPv6\n");
> +			close(fd);
> +		}
> +	}
> +#endif
> +
(...)
> +	getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &tmpmaxseg, &len);
> +	if (tmpmaxseg != defaultmss &&
> +	    setsockopt(fd, IPPROTO_TCP, TCP_MAXSEG,
> +	               &defaultmss, sizeof(defaultmss)) == -1) {

Please fix the alignment for the argument here, it's a bit confusing :-)

Otherwise looks good. I think it was a good idea to create temporary
sockets to retrieve some default settings. That may be something we
could generalize to guess other parameters or capabilities if needed in
the future. For example we could use this to detect whether or not IPv6
is supported and emit errors only once instead of every bind line.
Another use case is to always know the MSS applied to a listening socket
in order to automatically adjust the SSL maxrecord instead of assuming
1460 by default. Over time we might find other use cases.

Cheers,
Willy