Re: ModSecurity: First integration patches
On 11-04-2017 10:49, Thierry Fournier wrote:
> Hi list,
>
> I attach one usage of HAProxy / SPOE: WAF offloading. These patches
> are a first version; they have some limitations, described in the
> README file in the directory contrib/modsecurity.
>
> - Christopher, please check the patch "BUG/MINOR", it is about SPOE
>   functions.
> - The example of ModSecurity compilation can be improved. It is based
>   on my local distro.
>
> Feedback is welcome.
>
> Thierry

I agree with Olivier, that's really great stuff ;-)

To which branch (I assume master) can I apply this patch? I will then
create a docker image for testing ;-)

Aleks
Re: ModSecurity: First integration patches
> On 12 Apr 2017, at 09:57, Aleksandar Lazic wrote:
>
> On 11-04-2017 10:49, Thierry Fournier wrote:
>> Hi list,
>> I attach one usage of HAProxy / SPOE: WAF offloading.
>> These patches are a first version; they have some limitations,
>> described in the README file in the directory contrib/modsecurity.
>> - Christopher, please check the patch "BUG/MINOR", it is about SPOE
>> functions.
>> - The example of ModSecurity compilation can be improved. It is based
>> on my local distro.
>> Feedback is welcome.
>
> I agree with Olivier that's really great stuff ;-)

Thanks.

> To which branch (I assume master) can I apply this patch?

You’re right. The core haproxy patch is very light; I guess it can be
applied on any branch from 1.7.0 (it needs SPOE).

> I will then create a docker image for testing ;-)

Great!

>> Thierry
>
> Aleks
Re: ModSecurity: First integration patches
On 11/04/2017 at 10:49, Thierry Fournier wrote:
> Hi list,
> I attach one usage of HAProxy / SPOE: WAF offloading.
> These patches are a first version; they have some limitations,
> described in the README file in the directory contrib/modsecurity.
> - Christopher, please check the patch "BUG/MINOR", it is about SPOE
> functions.
> - The example of ModSecurity compilation can be improved. It is based
> on my local distro.
> Feedback is welcome.

Hi Thierry,

Really nice! I'll take a look at it soon. Glad to see the first service
that uses the SPOE! Good job.

--
Christopher Faulet
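For readers who want to try the wiring before the docker image exists: the integration follows the generic SPOE pattern. The sketch below is an assumption based on the general SPOE documentation, not a copy of Thierry's contrib/modsecurity README — the engine name, message name, argument list and agent address are illustrative and may differ from what the patches actually ship:

```
# haproxy.cfg (1.7+): hand HTTP requests to a ModSecurity SPOE agent
frontend web
    bind :8080
    filter spoe engine modsecurity config /etc/haproxy/spoe-modsecurity.conf
    default_backend app

backend spoe-modsecurity
    mode tcp
    server agent1 127.0.0.1:12345

# /etc/haproxy/spoe-modsecurity.conf
[modsecurity]
spoe-agent modsecurity-agent
    messages check-request
    timeout hello      100ms
    timeout idle       30s
    timeout processing 15ms
    use-backend spoe-modsecurity

spoe-message check-request
    args unique-id method path query req.ver req.hdrs_bin req.body_size req.body
    event on-frontend-http-request
```

The agent process (the ModSecurity wrapper from contrib/) listens on the address given in the SPOE backend and returns a verdict per message.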
Re: [RFC][PATCHES] seamless reload
Hi Olivier,

I was very eager to try out your patch set, thanks a lot! However, after
applying all of them (including the last three), it seems that using
environment variables in the config is broken (i.e. ${VARNAME} does not
get replaced with the value of the environment variable anymore). I am
using the systemd-wrapper, but not systemd.

Just from looking at the patches I couldn't make out what exactly might
be the problem. I guess it could be either the wrapper not passing on
its environment properly or haproxy not using it properly anymore. If
you have any immediate ideas, let me know. I'll try to fully debug this
when I can spare the time.

Thanks a lot,
Conrad

On 04/10/2017 08:09 PM, Olivier Houchard wrote:
> Hi,
>
> On top of those patches, here are 3 more patches.
> The first one makes the systemd wrapper check for a
> HAPROXY_STATS_SOCKET environment variable. If set, it will use that as
> an argument to -x when reloading the process.
> The second one sends listening unix sockets, as well as IPv4/v6
> sockets. I see no reason not to, and that means we no longer have to
> wait until the old process closes the socket before being able to
> accept new connections on it.
> The third one adds a new global option, nosockettransfer: if set, we
> assume we will never try to transfer listening sockets through the
> stats socket, and close any socket not bound to our process, to save a
> few file descriptors.
>
> Regards,
>
> Olivier

--
Conrad Hoffmann
Traffic Engineer
SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
Re: [RFC][PATCHES] seamless reload
On Wed, Apr 12, 2017 at 03:16:31PM +0200, Conrad Hoffmann wrote:
> Hi Olivier,
>
> I was very eager to try out your patch set, thanks a lot! However,
> after applying all of them (including the last three), it seems that
> using environment variables in the config is broken (i.e. ${VARNAME}
> does not get replaced with the value of the environment variable
> anymore). I am using the systemd-wrapper, but not systemd.
>
> Just from looking at the patches I couldn't make out what exactly
> might be the problem. I guess it could be either the wrapper not
> passing on its environment properly or haproxy not using it properly
> anymore. If you have any immediate ideas, let me know. I'll try to
> fully debug this when I can spare the time.
>
> Thanks a lot,
> Conrad

Hi Conrad,

You're right, that was just me being an idiot :)
Please replace 0007-MINOR-systemd-wrapper-add-support-for-passing-the-x-.patch
with the one attached, or just replace in src/haproxy-systemd-wrapper.c:

	argv = calloc(4 + main_argc + nb_pid + 1 +
	    stats_socket != NULL ? 2 : 0, sizeof(char *));

by:

	argv = calloc(4 + main_argc + nb_pid + 1 +
	    (stats_socket != NULL ? 2 : 0), sizeof(char *));

Regards,

Olivier

>From 526dca943b9cc89732c54bc43a6ce36e17b67890 Mon Sep 17 00:00:00 2001
From: Olivier Houchard
Date: Sun, 9 Apr 2017 16:28:10 +0200
Subject: [PATCH 7/9] MINOR: systemd wrapper: add support for passing the -x
 option.

Make the systemd wrapper check if HAPROXY_STATS_SOCKET is set. If set,
it will use it as an argument to the "-x" option, which makes haproxy
ask for any listening sockets on the stats socket, in order to achieve
reloads with no connection lost.
--- contrib/systemd/haproxy.service.in | 2 ++ src/haproxy-systemd-wrapper.c | 10 +- 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/contrib/systemd/haproxy.service.in b/contrib/systemd/haproxy.service.in index dca81a2..05bb716 100644 --- a/contrib/systemd/haproxy.service.in +++ b/contrib/systemd/haproxy.service.in @@ -3,6 +3,8 @@ Description=HAProxy Load Balancer After=network.target [Service] +# You can point the environment variable HAPROXY_STATS_SOCKET to a stats +# socket if you want seamless reloads. Environment="CONFIG=/etc/haproxy/haproxy.cfg" "PIDFILE=/run/haproxy.pid" ExecStartPre=@SBINDIR@/haproxy -f $CONFIG -c -q ExecStart=@SBINDIR@/haproxy-systemd-wrapper -f $CONFIG -p $PIDFILE diff --git a/src/haproxy-systemd-wrapper.c b/src/haproxy-systemd-wrapper.c index f6a9c85..457f5bd 100644 --- a/src/haproxy-systemd-wrapper.c +++ b/src/haproxy-systemd-wrapper.c @@ -92,11 +92,15 @@ static void spawn_haproxy(char **pid_strv, int nb_pid) pid = fork(); if (!pid) { char **argv; + char *stats_socket = NULL; int i; int argno = 0; /* 3 for "haproxy -Ds -sf" */ - argv = calloc(4 + main_argc + nb_pid + 1, sizeof(char *)); + if (nb_pid > 0) + stats_socket = getenv("HAPROXY_STATS_SOCKET"); + argv = calloc(4 + main_argc + nb_pid + 1 + + (stats_socket != NULL ? 2 : 0), sizeof(char *)); if (!argv) { fprintf(stderr, SD_NOTICE "haproxy-systemd-wrapper: failed to calloc(), please try again later.\n"); exit(1); @@ -121,6 +125,10 @@ static void spawn_haproxy(char **pid_strv, int nb_pid) argv[argno++] = "-sf"; for (i = 0; i < nb_pid; ++i) argv[argno++] = pid_strv[i]; + if (stats_socket != NULL) { + argv[argno++] = "-x"; + argv[argno++] = stats_socket; + } } argv[argno] = NULL; -- 2.9.3
Re: [RFC][PATCHES] seamless reload
On 04/12/2017 03:37 PM, Olivier Houchard wrote: > On Wed, Apr 12, 2017 at 03:16:31PM +0200, Conrad Hoffmann wrote: >> Hi Olivier, >> >> I was very eager to try out your patch set, thanks a lot! However, after >> applying all of them (including the last three), it seems that using >> environment variables in the config is broken (i.e. ${VARNAME} does not get >> replaced with the value of the environment variable anymore). I am using >> the systemd-wrapper, but not systemd. >> >> Just from looking at the patches I couldn't make out what exactly might be >> the problem. I guess it could be either the wrapper not passing on its >> environment properly or haproxy not using it properly anymore. If you have >> any immediate ideas, let me know. I'll try to fully debug this when I can >> spare the time. >> >> Thanks a lot, >> Conrad >> > > Hi Conrad, > > You're right, that was just me being an idiot :) > Please replace 0007-MINOR-systemd-wrapper-add-support-for-passing-the-x-.patch > with the one attached, or just replace in src/haproxy-systemd-wrapper.c : > > argv = calloc(4 + main_argc + nb_pid + 1 + > stats_socket != NULL ? 2 : 0, sizeof(char *)); > by : > argv = calloc(4 + main_argc + nb_pid + 1 + > (stats_socket != NULL ? 2 : 0), sizeof(char *)); > > Regards, > > Olivier Works indeed, hard to spot :) Thanks a lot, I'll now get back to testing the actual reloads! Conrad -- Conrad Hoffmann Traffic Engineer SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany Managing Director: Alexander Ljung | Incorporated in England & Wales with Company No. 6343600 | Local Branch Office | AG Charlottenburg | HRB 110657B
Re: [RFC][PATCHES] seamless reload
Hi again, so I tried to get this to work, but didn't manage yet. I also don't quite understand how this is supposed to work. The first haproxy process is started _without_ the -x option, is that correct? Where does that instance ever create the socket for transfer to later instances? I have it working now insofar that on reload, subsequent instances are spawned with the -x option, but they'll just complain that they can't get anything from the unix socket (because, for all I can tell, it's not there?). I also can't see the relevant code path where this socket gets created, but I didn't have time to read all of it yet. Am I doing something wrong? Did anyone get this to work with the systemd-wrapper so far? Also, but this might be a coincidence, my test setup takes a huge performance penalty just by applying your patches (without any reloading whatsoever). Did this happen to anybody else? I'll send some numbers and more details tomorrow. Thanks a lot, Conrad On 04/12/2017 03:47 PM, Conrad Hoffmann wrote: > On 04/12/2017 03:37 PM, Olivier Houchard wrote: >> On Wed, Apr 12, 2017 at 03:16:31PM +0200, Conrad Hoffmann wrote: >>> Hi Olivier, >>> >>> I was very eager to try out your patch set, thanks a lot! However, after >>> applying all of them (including the last three), it seems that using >>> environment variables in the config is broken (i.e. ${VARNAME} does not get >>> replaced with the value of the environment variable anymore). I am using >>> the systemd-wrapper, but not systemd. >>> >>> Just from looking at the patches I couldn't make out what exactly might be >>> the problem. I guess it could be either the wrapper not passing on its >>> environment properly or haproxy not using it properly anymore. If you have >>> any immediate ideas, let me know. I'll try to fully debug this when I can >>> spare the time. 
>>> >>> Thanks a lot, >>> Conrad >>> >> >> Hi Conrad, >> >> You're right, that was just me being an idiot :) >> Please replace >> 0007-MINOR-systemd-wrapper-add-support-for-passing-the-x-.patch >> with the one attached, or just replace in src/haproxy-systemd-wrapper.c : >> >> argv = calloc(4 + main_argc + nb_pid + 1 + >> stats_socket != NULL ? 2 : 0, sizeof(char *)); >> by : >> argv = calloc(4 + main_argc + nb_pid + 1 + >> (stats_socket != NULL ? 2 : 0), sizeof(char *)); >> >> Regards, >> >> Olivier > > > Works indeed, hard to spot :) > > Thanks a lot, I'll now get back to testing the actual reloads! > > Conrad > -- Conrad Hoffmann Traffic Engineer SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany Managing Director: Alexander Ljung | Incorporated in England & Wales with Company No. 6343600 | Local Branch Office | AG Charlottenburg | HRB 110657B
Re: [RFC][PATCHES] seamless reload
On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
> Hi again,
>
> so I tried to get this to work, but didn't manage yet. I also don't
> quite understand how this is supposed to work. The first haproxy
> process is started _without_ the -x option, is that correct? Where
> does that instance ever create the socket for transfer to later
> instances?
>
> I have it working now insofar that on reload, subsequent instances are
> spawned with the -x option, but they'll just complain that they can't
> get anything from the unix socket (because, for all I can tell, it's
> not there?). I also can't see the relevant code path where this socket
> gets created, but I didn't have time to read all of it yet.
>
> Am I doing something wrong? Did anyone get this to work with the
> systemd-wrapper so far?

You're right, the first one runs without -x. The socket used is just
any stats socket.

> Also, but this might be a coincidence, my test setup takes a huge
> performance penalty just by applying your patches (without any
> reloading whatsoever). Did this happen to anybody else? I'll send some
> numbers and more details tomorrow.

Ah, this is news to me, I did not expect a performance impact. Can you
share your configuration file?

Thanks!

Olivier
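To make "just any stats socket" concrete, here is a sketch of how a setup using these patches would be driven by hand (paths and flags other than -x/-sf are illustrative assumptions): the only prerequisite is that the running config declares a stats socket, which the next process then queries with -x before asking the old one to stop:

```
# haproxy.cfg: any admin-level stats socket will do
global
    stats socket /var/run/haproxy.sock mode 600 level admin

# first start: no -x, since there is no previous process to fetch
# listening sockets from
haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid

# reload: fetch the old process's listening FDs over the stats socket,
# then tell it to finish serving and exit (-sf)
haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid \
        -x /var/run/haproxy.sock -sf $(cat /run/haproxy.pid)
```

With the systemd wrapper, exporting HAPROXY_STATS_SOCKET=/var/run/haproxy.sock makes the wrapper append the -x arguments itself on reload.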
Re: [RFC][PATCHES] seamless reload
On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote: > Hi again, > > so I tried to get this to work, but didn't manage yet. I also don't quite > understand how this is supposed to work. The first haproxy process is > started _without_ the -x option, is that correct? Where does that instance > ever create the socket for transfer to later instances? > > I have it working now insofar that on reload, subsequent instances are > spawned with the -x option, but they'll just complain that they can't get > anything from the unix socket (because, for all I can tell, it's not > there?). I also can't see the relevant code path where this socket gets > created, but I didn't have time to read all of it yet. > > Am I doing something wrong? Did anyone get this to work with the > systemd-wrapper so far? > > Also, but this might be a coincidence, my test setup takes a huge > performance penalty just by applying your patches (without any reloading > whatsoever). Did this happen to anybody else? I'll send some numbers and > more details tomorrow. > Ok I can confirm the performance issues, I'm investigating. Olivier
Re: [RFC][PATCHES] seamless reload
On Wed, Apr 12, 2017 at 05:50:54PM +0200, Olivier Houchard wrote: > On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote: > > Hi again, > > > > so I tried to get this to work, but didn't manage yet. I also don't quite > > understand how this is supposed to work. The first haproxy process is > > started _without_ the -x option, is that correct? Where does that instance > > ever create the socket for transfer to later instances? > > > > I have it working now insofar that on reload, subsequent instances are > > spawned with the -x option, but they'll just complain that they can't get > > anything from the unix socket (because, for all I can tell, it's not > > there?). I also can't see the relevant code path where this socket gets > > created, but I didn't have time to read all of it yet. > > > > Am I doing something wrong? Did anyone get this to work with the > > systemd-wrapper so far? > > > > Also, but this might be a coincidence, my test setup takes a huge > > performance penalty just by applying your patches (without any reloading > > whatsoever). Did this happen to anybody else? I'll send some numbers and > > more details tomorrow. > > > > Ok I can confirm the performance issues, I'm investigating. > Found it, I was messing with SO_LINGER when I shouldn't have been. Can you try the new version of 0003-MINOR-tcp-When-binding-socket-attempt-to-reuse-one-f.patch or replace, in src/proto_tcp.c : if (getsockopt(fd, SOL_SOCKET, SO_LINGER, &tmplinger, &len) == 0 && (tmplinger.l_onoff == 0 || tmplinger.l_linger == 0)) { /* Attempt to activate SO_LINGER, not sure what a sane * value is, using the default BSD value of 120s. 
 */
	tmplinger.l_onoff = 1;
	tmplinger.l_linger = 120;
	setsockopt(fd, SOL_SOCKET, SO_LINGER, &tmplinger,
	    sizeof(tmplinger));
}

by:

if (getsockopt(fd, SOL_SOCKET, SO_LINGER, &tmplinger, &len) == 0 &&
    (tmplinger.l_onoff == 1 || tmplinger.l_linger == 0)) {
	tmplinger.l_onoff = 0;
	tmplinger.l_linger = 0;
	setsockopt(fd, SOL_SOCKET, SO_LINGER, &tmplinger,
	    sizeof(tmplinger));
}

Thanks!

Olivier

>From 1bf2b9c550285b434c865adeb175277ce00c842b Mon Sep 17 00:00:00 2001
From: Olivier Houchard
Date: Wed, 5 Apr 2017 22:39:56 +0200
Subject: [PATCH 3/9] MINOR: tcp: When binding socket, attempt to reuse one
 from the old proc.

Try to reuse any socket from the old process, provided by the "-x"
flag, before binding a new one, assuming it is compatible. "Compatible"
here means same address and port, same namespace if any, same interface
if any, and that the following flags are the same: LI_O_FOREIGN,
LI_O_V6ONLY and LI_O_V4V6.
Also change tcp_bind_listener() to always enable/disable socket
options, instead of just doing so if it is in the configuration file,
as the option may have been removed, e.g. TCP_FASTOPEN may have been
set in the old process and removed from the new configuration, so we
have to disable it.
--- src/proto_tcp.c | 119 +++- 1 file changed, 117 insertions(+), 2 deletions(-) diff --git a/src/proto_tcp.c b/src/proto_tcp.c index 5e12b99..ea6b8f7 100644 --- a/src/proto_tcp.c +++ b/src/proto_tcp.c @@ -731,6 +731,83 @@ int tcp_connect_probe(struct connection *conn) return 0; } +/* XXX: Should probably be elsewhere */ +static int compare_sockaddr(struct sockaddr_storage *a, struct sockaddr_storage *b) +{ + if (a->ss_family != b->ss_family) { + return (-1); + } + switch (a->ss_family) { + case AF_INET: + { + struct sockaddr_in *a4 = (void *)a, *b4 = (void *)b; + if (a4->sin_port != b4->sin_port) + return (-1); + return (memcmp(&a4->sin_addr, &b4->sin_addr, + sizeof(a4->sin_addr))); + } + case AF_INET6: + { + struct sockaddr_in6 *a6 = (void *)a, *b6 = (void *)b; + if (a6->sin6_port != b6->sin6_port) + return (-1); + return (memcmp(&a6->sin6_addr, &b6->sin6_addr, + sizeof(a6->sin6_addr))); + } + default: + return (-1); + } + +} + +#define LI_MANDATORY_FLAGS (LI_O_FOREIGN | LI_O_V6ONLY | LI_O_V4V6) +/* When binding the listeners, check if a socket has been sent to us by the + * previous process that we could reuse, instead of creating a new one. + */ +static int tcp_find_compatible_fd(struct listener *l) +{ + struct xfer_sock_list *xfer_sock = xfer_sock_list; + int ret = -1; + +
Re: [RFC][PATCHES] seamless reload
Yet another patch, on top of the previous ones. This one tries to get the default value of TCP_MAXSEG by creating a temporary TCP socket, so that one can remove the "mss" entry from its configuration file, and reset the mss value for any transferred socket from the old process. Olivier >From 7dc2432f3a7c4a9e9531adafa4524a199e394f90 Mon Sep 17 00:00:00 2001 From: Olivier Houchard Date: Wed, 12 Apr 2017 19:32:15 +0200 Subject: [PATCH 10/10] MINOR: tcp: Attempt to reset TCP_MAXSEG when reusing a socket. Guess the default value for TCP_MAXSEG by binding a temporary TCP socket and getting it, and use that in case the we're reusing a socket from the old process, and its TCP_MAXSEG is different. That way one can reset the TCP_MAXSEG value to the default one when reusing sockets. --- src/proto_tcp.c | 55 --- 1 file changed, 52 insertions(+), 3 deletions(-) diff --git a/src/proto_tcp.c b/src/proto_tcp.c index ea6b8f7..8050f3e 100644 --- a/src/proto_tcp.c +++ b/src/proto_tcp.c @@ -110,6 +110,12 @@ static struct protocol proto_tcpv6 = { .nb_listeners = 0, }; +/* Default TCP parameters, got by opening a temporary TCP socket. */ +#ifdef TCP_MAXSEG +static int default_tcp_maxseg = -1; +static int default_tcp6_maxseg = -1; +#endif + /* Binds ipv4/ipv6 address to socket , unless is set, in which * case we try to bind . is a 2-bit field consisting of : * - 0 : ignore remote address (may even be a NULL pointer) @@ -829,6 +835,35 @@ int tcp_bind_listener(struct listener *listener, char *errmsg, int errlen) int ext, ready; socklen_t ready_len; const char *msg = NULL; +#ifdef TCP_MAXSEG + + /* Create a temporary TCP socket to get default parameters we can't +* guess. 
+* */ + ready_len = sizeof(default_tcp_maxseg); + if (default_tcp_maxseg == -1) { + default_tcp_maxseg = -2; + fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP); + if (fd < 0) + Warning("Failed to create a temporary socket!\n"); + else { + if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &default_tcp_maxseg, + &ready_len) == -1) + Warning("Failed to get the default value of TCP_MAXSEG\n"); + } + } + if (default_tcp6_maxseg == -1) { + default_tcp6_maxseg = -2; + fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP); + if (fd >= 0) { + if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &default_tcp6_maxseg, + &ready_len) == -1) + Warning("Failed ot get the default value of TCP_MAXSEG for IPv6\n"); + close(fd); + } + } +#endif + /* ensure we never return garbage */ if (errlen) @@ -960,10 +995,24 @@ int tcp_bind_listener(struct listener *listener, char *errmsg, int errlen) msg = "cannot set MSS"; err |= ERR_WARN; } + } else if (ext) { + int tmpmaxseg = -1; + int defaultmss; + socklen_t len = sizeof(tmpmaxseg); + + if (listener->addr.ss_family == AF_INET) + defaultmss = default_tcp_maxseg; + else + defaultmss = default_tcp6_maxseg; + + getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &tmpmaxseg, &len); + if (tmpmaxseg != defaultmss && + setsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, + &defaultmss, sizeof(defaultmss)) == -1) { + msg = "cannot set MSS"; + err |= ERR_WARN; + } } - /* XXX: No else, the way to get the default MSS will vary from system -* to system. -*/ #endif #if defined(TCP_USER_TIMEOUT) if (listener->tcp_ut) { -- 2.9.3
Re: [RFC][PATCHES] seamless reload
I had a problem testing it on Mac OS X, because cmsghdr is aligned to 4 bytes. I changed the CMSG_ALIGN(sizeof(struct cmsghdr)) call to CMSG_LEN(0) to fix it. On Wed, Apr 12, 2017 at 10:41 AM, Olivier Houchard wrote: > Yet another patch, on top of the previous ones. > This one tries to get the default value of TCP_MAXSEG by creating a > temporary > TCP socket, so that one can remove the "mss" entry from its configuration > file, > and reset the mss value for any transferred socket from the old process. > > Olivier >
Re: haproxy deleting domain socket on graceful reload if backlog overflows
This just hit us again on a different set of load balancers... if there's a listen socket overflow on a domain socket during graceful, haproxy completely deletes the domain socket and becomes inaccessible. On Tue, Feb 21, 2017 at 6:47 PM, James Brown wrote: > Under load, we're sometimes seeing a situation where HAProxy will > completely delete a bound unix domain socket after a reload. > > The "bad flow" looks something like the following: > > >- haproxy is running on pid A, bound to /var/run/domain.sock (via a >bind line in a frontend) >- we run `haproxy -sf A`, which starts a new haproxy on pid B >- pid B binds to /var/run/domain.sock.B >- pid B moves /var/run/domain.sock.B to /var/run/domain.sock (in >uxst_bind_listener) >- in the mean time, there are a zillion connections to >/var/run/domain.sock and pid B isn't started up yet; backlog is exhausted >- pid B signals pid A to shut down >- pid A runs the destroy_uxst_socket function and tries to connect to >/var/run/domain.sock to see if it's still in use. The connection fails >(because the backlog is full). Pid A unlinks /var/run/domain.sock. >Everything is sad forever now. > > I'm thinking about just commenting out the call to destroy_uxst_socket > since this is all on a tmpfs and we don't really care if spare sockets are > leaked when/if we change configuration in the future. Arguably, the > solution should be something where we don't overflow the listen socket at > all; I'm thinking about also binding to a TCP port on localhost and just > using that for the few seconds it takes to reload (since otherwise we run > out of ephemeral sockets to 127.0.0.1); it still seems wrong for haproxy to > unlink the socket, though. 
> > This has proven extremely irritating to reproduce (since it only occurs if > there's enough load to fill up the backlog on the socket between when pid B > starts up and when pid A shuts down), but I'm pretty confident that what I > described above is happening, since periodically on reloads the domain > socket isn't there and this code fits. > > Our configs are quite large, so I'm not reproducing them here. The reason > we bind on a domain socket at all is because we're running two sets of > haproxies — one in multi-process mode doing TCP-mode SSL termination > pointing back over a domain socket to a single-process haproxy applying all > of our actual config. > > -- > James Brown > Systems > Engineer > -- James Brown Engineer
Re: haproxy deleting domain socket on graceful reload if backlog overflows
Hi James,

When you do a graceful reload of haproxy, this is what happens:

1. The old process accepts no more connections; the stats page is
   stopped, and so is the socket.
2. A new haproxy instance is started, which new clients get connected
   to, and this one has the live socket.
3. When the old haproxy instance has no more clients left, it dies
   silently, leaving all the clients on the new haproxy instance.

This is expected behavior, as you want the first haproxy to die when
the last client leaves.

Regards,

Andrew Smalley
Loadbalancer.org Ltd.

On 12 April 2017 at 19:32, James Brown wrote:
> This just hit us again on a different set of load balancers... if
> there's a listen socket overflow on a domain socket during graceful,
> haproxy completely deletes the domain socket and becomes inaccessible.
>
> On Tue, Feb 21, 2017 at 6:47 PM, James Brown wrote:
>> Under load, we're sometimes seeing a situation where HAProxy will
>> completely delete a bound unix domain socket after a reload.
>>
>> The "bad flow" looks something like the following:
>>
>> - haproxy is running on pid A, bound to /var/run/domain.sock (via a
>>   bind line in a frontend)
>> - we run `haproxy -sf A`, which starts a new haproxy on pid B
>> - pid B binds to /var/run/domain.sock.B
>> - pid B moves /var/run/domain.sock.B to /var/run/domain.sock (in
>>   uxst_bind_listener)
>> - in the mean time, there are a zillion connections to
>>   /var/run/domain.sock and pid B isn't started up yet; backlog is
>>   exhausted
>> - pid B signals pid A to shut down
>> - pid A runs the destroy_uxst_socket function and tries to connect
>>   to /var/run/domain.sock to see if it's still in use. The
>>   connection fails (because the backlog is full). Pid A unlinks
>>   /var/run/domain.sock. Everything is sad forever now.
>>
>> I'm thinking about just commenting out the call to
>> destroy_uxst_socket since this is all on a tmpfs and we don't really
>> care if spare sockets are leaked when/if we change configuration in
>> the future. Arguably, the solution should be something where we
>> don't overflow the listen socket at all; I'm thinking about also
>> binding to a TCP port on localhost and just using that for the few
>> seconds it takes to reload (since otherwise we run out of ephemeral
>> sockets to 127.0.0.1); it still seems wrong for haproxy to unlink
>> the socket, though.
>>
>> This has proven extremely irritating to reproduce (since it only
>> occurs if there's enough load to fill up the backlog on the socket
>> between when pid B starts up and when pid A shuts down), but I'm
>> pretty confident that what I described above is happening, since
>> periodically on reloads the domain socket isn't there and this code
>> fits.
>>
>> Our configs are quite large, so I'm not reproducing them here. The
>> reason we bind on a domain socket at all is because we're running
>> two sets of haproxies: one in multi-process mode doing TCP-mode SSL
>> termination pointing back over a domain socket to a single-process
>> haproxy applying all of our actual config.
>>
>> --
>> James Brown
>> Systems Engineer
>
> --
> James Brown
> Engineer
Re: ModSecurity: First integration patches
Hi.

On 12-04-2017 10:08, Thierry Fournier wrote:
>> On 12 Apr 2017, at 09:57, Aleksandar Lazic wrote:
>>
>> On 11-04-2017 10:49, Thierry Fournier wrote:
>>> Hi list,
>>> I attach one usage of HAProxy / SPOE: WAF offloading.
>>> These patches are a first version; they have some limitations,
>>> described in the README file in the directory contrib/modsecurity.
>>> - Christopher, please check the patch "BUG/MINOR", it is about SPOE
>>> functions.
>>> - The example of ModSecurity compilation can be improved. It is
>>> based on my local distro.
>>> Feedback is welcome.
>>
>> I agree with Olivier that's really great stuff ;-)
>
> Thanks.
>
>> To which branch (I assume master) can I apply this patch?
>
> You’re right. The core haproxy patch is very light; I guess it can be
> applied on any branch from 1.7.0 (it needs SPOE).
>
>> I will then create a docker image for testing ;-)
>
> Great!

Do you have the patches as files where I can download them? It's easier
for docker to call a 'curl -vLO ...' than to go across a mail body ;-)

> Thierry

Aleks
Re: ModSecurity: First integration patches
On Wed, 12 Apr 2017 21:21:58 +0200, Aleksandar Lazic wrote:
> Hi.
>
> On 12-04-2017 10:08, Thierry Fournier wrote:
>>> On 12 Apr 2017, at 09:57, Aleksandar Lazic wrote:
>>>
>>> On 11-04-2017 10:49, Thierry Fournier wrote:
>>>> Hi list,
>>>> I attach one usage of HAProxy / SPOE: WAF offloading.
>>>> These patches are a first version; they have some limitations,
>>>> described in the README file in the directory contrib/modsecurity.
>>>> - Christopher, please check the patch "BUG/MINOR", it is about
>>>> SPOE functions.
>>>> - The example of ModSecurity compilation can be improved. It is
>>>> based on my local distro.
>>>> Feedback is welcome.
>>>
>>> I agree with Olivier that's really great stuff ;-)
>>
>> Thanks.
>>
>>> To which branch (I assume master) can I apply this patch?
>>
>> You’re right. The core haproxy patch is very light; I guess it can
>> be applied on any branch from 1.7.0 (it needs SPOE).
>>
>>> I will then create a docker image for testing ;-)
>>
>> Great!
>
> Do you have the patches as files where I can download them?
> It's easier for docker to call a 'curl -vLO ...' than to go across a
> mail body ;-)

Not sure I understand: I did attach the patches as files. Note that I'm
testing a new email client. So I put the patches here:

http://www.arpalert.org/0001-BUG-MINOR-change-header-declared-function-to-static-.patch
http://www.arpalert.org/0002-MINOR-Add-binary-encoding-request-sample-fetch.patch
http://www.arpalert.org/0003-MINOR-Add-ModSecurity-wrapper-as-contrib.patch

Thierry

>>>> Thierry
>>>
>>> Aleks
Re: [RFC][PATCHES] seamless reload
On Wed, Apr 12, 2017 at 11:19:37AM -0700, Steven Davidovitz wrote: > I had a problem testing it on Mac OS X, because cmsghdr is aligned to 4 > bytes. I changed the CMSG_ALIGN(sizeof(struct cmsghdr)) call to CMSG_LEN(0) > to fix it. > Oh right, I'll change that. Thanks a lot ! Olivier
Lua memory allocator
Thierry,

While instrumenting my malloc/free functions to debug a problem, I was
hit by a malloc/realloc inconsistency in the Lua allocator. The problem
is that luaL_newstate() uses malloc() to create its first objects, and
only after this do we change the allocator to use ours. Thus the first
realloc() call fails, because it tries to reallocate with one function
something that was allocated differently earlier.

I found some info regarding the fact that it was instead possible to
use lua_newstate() and pass it the allocator to use. So that's what I
did, and it solves the problem for me. I think it also results in more
accurate accounting of the memory in use by the Lua stack, since even
the very first malloc is counted now.

But could you take a quick look to ensure I didn't overlook anything?
I'm attaching the patch; if you're fine with it I can merge it.

Thanks,
Willy

>From a1ee4ff02e0755d9c8237615fe5ca7124c70c1ad Mon Sep 17 00:00:00 2001
From: Willy Tarreau
Date: Wed, 12 Apr 2017 21:40:29 +0200
Subject: MINOR: lua: ensure the memory allocator is used all the time

luaL_newstate() uses malloc() to initialize the first objects, and only
after this do we replace the allocator. This creates trouble when
replacing the standard memory allocators during debugging sessions,
since the new allocator is used to realloc() an area previously
allocated using the default malloc().

Lua provides lua_newstate() in addition to luaL_newstate(), which takes
an allocator for the initial malloc. This is exactly what we need, and
this patch does this and fixes the problem.

This has no impact outside of debugging sessions and there's no need to
backport this.
---
 src/hlua.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/hlua.c b/src/hlua.c
index 5383fe9..9b59194 100644
--- a/src/hlua.c
+++ b/src/hlua.c
@@ -7182,7 +7182,7 @@ void hlua_init(void)
 	gL.Mref = LUA_REFNIL;
 	gL.flags = 0;
 	LIST_INIT(&gL.com);
-	gL.T = luaL_newstate();
+	gL.T = lua_newstate(hlua_alloc, &hlua_global_allocator);
 	hlua_sethlua(&gL);
 	gL.Tref = LUA_REFNIL;
 	gL.task = NULL;
--
1.7.12.1
Re: ModSecurity: First integration patches
Am 12-04-2017 21:28, schrieb thierry.fourn...@arpalert.org:

> On Wed, 12 Apr 2017 21:21:58 +0200 Aleksandar Lazic wrote:
>> Hi.
>>
>> Am 12-04-2017 10:08, schrieb Thierry Fournier:
>> [snipp]
>>
>> Do you have the patches as files where I can download them?
>> It's easier for docker to call a 'curl -vLO ...' than to go across a
>> mail body ;-)
>
> Not sure I understand: I sent the patches as files. Note that I'm
> testing a new email client. So I put the patches here:
>
> http://www.arpalert.org/0001-BUG-MINOR-change-header-declared-function-to-static-.patch
> http://www.arpalert.org/0002-MINOR-Add-binary-encoding-request-sample-fetch.patch
> http://www.arpalert.org/0003-MINOR-Add-ModSecurity-wrapper-as-contrib.patch

I'm so sorry for the rush. :-(
I saw too late that you had sent the patches to the list.
Thanks for the links. I will take more care in the future.

> Thierry

Aleks
Re: haproxy deleting domain socket on graceful reload if backlog overflows
Hi Andrew:

Thanks for your feedback, but I'm describing a very specific bug wherein
the old haproxy will unlink the new haproxy's bound unix domain socket
upon reload, due to a race condition in the domain socket cleanup code,
if a listen overflow occurs while the graceful is in process.

On Wed, Apr 12, 2017 at 11:39 AM, Andrew Smalley wrote:
> HI James
>
> When you do a graceful reload of haproxy this is what happens.
>
> 1. the old process will accept no more connections and the stats page
>    is stopped and so is the socket
> 2. a new haproxy instance is started where new clients get connected
>    to, and this has the live socket
> 3. when the old haproxy instance has no more clients left it dies
>    silently leaving all the clients on the new haproxy instance.
>
> This is expected behavior as you want the first haproxy to die when
> the last client leaves.
>
> Regards
>
> Andrew Smalley
>
> Loadbalancer.org Ltd.
>
> On 12 April 2017 at 19:32, James Brown wrote:
>
>> This just hit us again on a different set of load balancers... if
>> there's a listen socket overflow on a domain socket during graceful,
>> haproxy completely deletes the domain socket and becomes inaccessible.
>>
>> On Tue, Feb 21, 2017 at 6:47 PM, James Brown wrote:
>>
>>> Under load, we're sometimes seeing a situation where HAProxy will
>>> completely delete a bound unix domain socket after a reload.
>>>
>>> The "bad flow" looks something like the following:
>>>
>>> - haproxy is running on pid A, bound to /var/run/domain.sock (via a
>>>   bind line in a frontend)
>>> - we run `haproxy -sf A`, which starts a new haproxy on pid B
>>> - pid B binds to /var/run/domain.sock.B
>>> - pid B moves /var/run/domain.sock.B to /var/run/domain.sock (in
>>>   uxst_bind_listener)
>>> - in the meantime, there are a zillion connections to
>>>   /var/run/domain.sock and pid B isn't started up yet; the backlog
>>>   is exhausted
>>> - pid B signals pid A to shut down
>>> - pid A runs the destroy_uxst_socket function and tries to connect
>>>   to /var/run/domain.sock to see if it's still in use. The
>>>   connection fails (because the backlog is full). Pid A unlinks
>>>   /var/run/domain.sock. Everything is sad forever now.
>>>
>>> I'm thinking about just commenting out the call to
>>> destroy_uxst_socket, since this is all on a tmpfs and we don't really
>>> care if spare sockets are leaked when/if we change configuration in
>>> the future. Arguably, the solution should be something where we don't
>>> overflow the listen socket at all; I'm thinking about also binding to
>>> a TCP port on localhost and just using that for the few seconds it
>>> takes to reload (since otherwise we run out of ephemeral sockets to
>>> 127.0.0.1); it still seems wrong for haproxy to unlink the socket,
>>> though.
>>>
>>> This has proven extremely irritating to reproduce (since it only
>>> occurs if there's enough load to fill up the backlog on the socket
>>> between when pid B starts up and when pid A shuts down), but I'm
>>> pretty confident that what I described above is happening, since
>>> periodically on reloads the domain socket isn't there and this code
>>> fits.
>>>
>>> Our configs are quite large, so I'm not reproducing them here. The
>>> reason we bind on a domain socket at all is because we're running two
>>> sets of haproxies — one in multi-process mode doing TCP-mode SSL
>>> termination pointing back over a domain socket to a single-process
>>> haproxy applying all of our actual config.
>>>
>>> --
>>> James Brown
>>> Systems Engineer

--
James Brown
Engineer
Re: haproxy deleting domain socket on graceful reload if backlog overflows
HI James

Thank you for your reply. I do not see how the old haproxy, being on a
separate PID, could do anything with a socket created by a new PID.

Do you bring up your new instance with real servers in a maintenance
state? That seems to be required to do a correct handover before making
them live and active/ready to handle connections.

Also, is there a SYN_BLOCK firewall rule required during the reload? I
ask because we have had no reports of such a race condition.

Regards

Andrew Smalley

Loadbalancer.org Ltd.

On 12 April 2017 at 23:34, James Brown wrote:
> Hi Andrew:
>
> Thanks for your feedback, but I'm describing a very specific bug
> wherein the old haproxy will unlink the new haproxy's bound unix
> domain socket upon reload due to a race condition in the domain socket
> cleanup code if a listen overflow occurs while the graceful is in
> process.
>
> [snipp]
>
> --
> James Brown
> Engineer
Re: ModSecurity: First integration patches
Am 12-04-2017 23:33, schrieb Aleksandar Lazic:

> Am 12-04-2017 21:28, schrieb thierry.fourn...@arpalert.org:
>> [snipp]
>>
>> So I put the patches here:
>>
>> http://www.arpalert.org/0001-BUG-MINOR-change-header-declared-function-to-static-.patch
>> http://www.arpalert.org/0002-MINOR-Add-binary-encoding-request-sample-fetch.patch
>> http://www.arpalert.org/0003-MINOR-Add-ModSecurity-wrapper-as-contrib.patch
>
> I'm so sorry for the rush. :-(
> I saw too late that you had sent the patches to the list.
> Thanks for the links. I will take more care in the future.

I have now built haproxy with modsecurity on CentOS 7.3 ;-)

I used this file for modsecurity:
https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/v3.0/master/crs-setup.conf.example

###
/usr/local/bin/modsecurity -f crs-setup.conf.example
1492041223.145110 [00] ModSecurity for nginx (STABLE)/2.9.1 (http://www.modsecurity.org/) configured.
1492041223.145159 [00] ModSecurity: APR compiled version="1.4.8"; loaded version="1.4.8"
1492041223.145193 [00] ModSecurity: PCRE compiled version="8.32 "; loaded version="8.32 2012-11-30"
1492041223.145197 [00] ModSecurity: LIBXML compiled version="2.9.1"
1492041223.145200 [00] ModSecurity: Status engine is currently disabled, enable it by set SecStatusEngine to On.
1492041228.152877 [01] 0 clients connected
1492041228.153037 [02] 0 clients connected
1492041228.153069 [03] 0 clients connected
...
###

It was a little bit challenging:

.) The patches apply only on haproxy 1.8, because some files do not
exist on 1.7 (e.g. include/proto/spoe.h).

git clone http://git.haproxy.org/git/haproxy.git/
patch -d haproxy -p 1 -i /usr/src/0001-BUG-MINOR-change-header-declared-function-to-static-.patch
patch -d haproxy -p 1 -i /usr/src/0002-MINOR-Add-binary-encoding-request-sample-fetch.patch
patch -d haproxy -p 1 -i /usr/src/0003-MINOR-Add-ModSecurity-wrapper-as-contrib.patch

.) You will need a lot of devel packages, including some httpd ones.

yum install -y apr-devel apr-util-devel gcc make libevent-devel libxml2-devel libcurl-devel httpd-devel pcre-devel yajl-devel

.) I will figure out which runtime packages are necessary.

.) I have started a Dockerfile, which you can find at github:
https://github.com/git001/haproxy-waf/blob/master/Dockerfile

Open questions for me:

.) How is transfer-encoding handled (a.k.a. streaming)?
.) How big can a content be? Where can we define some limits?
.) How can the rule-set be reloaded? stop & start || gracefully?

Again thanks Thierry for your work, this looks very good.

Regards
Aleks
Re: haproxy deleting domain socket on graceful reload if backlog overflows
On Apr 12, 2017 6:49 PM, "Andrew Smalley" wrote:

> HI James
>
> Thank you for your reply. I do not see how the old haproxy being on a
> separate PID could do anything with a socket created by a new PID.

How? Easily. Unix domain sockets are presented as files. *Any* process
can unlink a domain socket right out from under you, just like any file,
even if you have an open handle... domain sockets aren't owned by a pid.
This report, prima facie, seems entirely plausible.
Re: low load client payload intermittently dropped with a "cD" error (v1.7.3)
Thanks Bryan,

The problem I'm having is isolated to the first second of the
connection, not the end.

Here is a summary of the tcp traffic. Hopefully it makes the example
more clear.

*client connects to haproxy: (all good)*
38.057127 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [S], seq 2113072542, win 43690, options [mss 65495,sackOK,TS val 82055529 ecr 0,nop,wscale 7], length 0
38.057156 IP 127.0.0.1.9011 > 127.0.0.1.39888: Flags [S.], seq 3284611992, ack 2113072543, win 43690, options [mss 65495,sackOK,TS val 82055529 ecr 82055529,nop,wscale 7], length 0
38.057178 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [.], ack 1, win 342, options [nop,nop,TS val 82055529 ecr 82055529], length 0

*haproxy starts connecting to server (SYN) (good)*
38.057295 IP 10.10.10.10.34289 > 99.99.99.99.8000: Flags [S], seq 35567, win 29200, options [mss 1460,sackOK,TS val 82055529 ecr 0,nop,wscale 7], length 0

*client sends 198 bytes to initiate SSL connection and we have an ACK (good)*
38.060539 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [P.], seq 1:199, ack 1, win 342, options [nop,nop,TS val 82055530 ecr 82055529], length 198
38.060598 IP 127.0.0.1.9011 > 127.0.0.1.39888: Flags [.], ack 199, win 350, options [nop,nop,TS val 82055530 ecr 82055530], length 0

*haproxy finishes connecting to the server (SYNACK/ACK) (good)*
38.120527 IP 99.99.99.99.8000 > 10.10.10.10.34289: Flags [S.], seq 4125907118, ack 35568, win 28960, options [mss 1460,sackOK,TS val 662461622 ecr 82055529,nop,wscale 8], length 0
38.120619 IP 10.10.10.10.34289 > 99.99.99.99.8000: Flags [.], ack 1, win 229, options [nop,nop,TS val 82055545 ecr 662461622], length 0

*And now there is nothing for 5 seconds, when we should have seen a
data packet from haproxy to the server with the "length 198" payload.*

*It appears that haproxy never tried to send the data!?!?*

*The server then disconnects 5 seconds into the transaction since it
got no data. (That is the way the server is supposed to behave.)*
43.183207 IP 99.99.99.99.8000 > 10.10.10.10.34289: Flags [F.], seq 1, ack 1, win 114, options [nop,nop,TS val 662466683 ecr 82055545], length 0

On Mon, Apr 10, 2017 at 7:58 PM, Bryan Talbot wrote:

> On Apr 8, 2017, at 2:24 PM, Lincoln Stern wrote:
>
>> I'm not sure how to interpret this, but it appears that haproxy is
>> dropping client payload intermittently (1/100). I have included
>> tcpdumps and logs to show what is happening.
>>
>> Am I doing something wrong? I have no idea what could be causing this
>> or how to go about debugging it. I cannot reproduce it, but I do
>> observe it in production ~2 times a day across 20 instances and 2K
>> connections.
>>
>> Any help or advice would be greatly appreciated.
>
> You're in TCP mode with 60 second timeouts. So, if the connection is
> idle for that long then the proxy will disconnect. If you need idle
> connections to stick around longer and mix http and tcp traffic then
> you probably want to set "timeout tunnel" to however long you're
> willing to let idle tcp connections sit around and not impact http
> timeouts. If you only need long-lived tcp "tunnel" connections, then
> you can instead just increase both your "timeout client" and "timeout
> server" timeouts to cover your requirements.
>
> -Bryan
>
>> What I'm trying to accomplish is to provide HA availability over two
>> routes (i.e. internet providers). One acts as primary and I gave it a
>> "static-rr" "weight" of 256 and the other as backup and has a weight
>> of "1". Backup should only be used in case of primary failure.
>>
>> log:
>> Apr 4 18:55:27 app055 haproxy[13666]: 127.0.0.1:42262 [04/Apr/2017:18:54:41.585] ws-local servers/server1 1/86/45978 4503 5873 -- 0/0/0/0/0 0/0
>> Apr 4 22:46:37 app055 haproxy[13666]: 127.0.0.1:47130 [04/Apr/2017:22:46:36.931] ws-local servers/server1 1/62/663 7979 517 -- 0/0/0/0/0 0/0
>> Apr 4 22:46:38 app055 haproxy[13666]: 127.0.0.1:32931 [04/Apr/2017:22:46:37.698] ws-local servers/server1 1/55/405 3062 553 -- 1/1/1/1/0 0/0
>> Apr 4 22:46:43 app055 haproxy[13666]: 127.0.0.1:41748 [04/Apr/2017:22:46:43.190] ws-local servers/server1 1/115/452 7979 517 -- 2/2/2/2/0 0/0
>> Apr 4 22:46:46 app055 haproxy[13666]: 127.0.0.1:57226 [04/Apr/2017:22:46:43.576] ws-local servers/server1 1/76/3066 2921 538 -- 1/1/1/1/0 0/0
>> Apr 4 22:46:47 app055 haproxy[13666]: 127.0.0.1:39656 [04/Apr/2017:22:46:47.072] ws-local servers/server1 1/67/460 8254 528 -- 1/1/1/1/0 0/0
>> Apr 4 22:47:38 app055 haproxy[13666]: 127.0.0.1:39888 [04/Apr/2017:22:46:38.057] ws-local servers/server1 1/63/60001 0 0 cD 0/0/0/0/0 0/0
>> Apr 5 08:44:55 app055 haproxy[13666]: 127.0.0.1:42650 [05/Apr/2017:08:44:05.529] ws-local servers/server1 1/53/49645 4364 4113 -- 0/0/0/0/0 0/0
>>
>> tcpdump:
>> 22:46:38.057127 IP 127.0.0.1.39888 > 127.0.0.1.9011: Flags [S], seq 2113072542, win 43690, options [mss 65495,sackOK,TS val 82055529 ecr 0,nop,wscale 7], length 0
>> 22:46:38.05715
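Bryan's timeout advice quoted above can be sketched as a config
fragment. The directives are real HAProxy keywords; the values are
placeholders, not taken from the thread:

```haproxy
defaults
    mode tcp
    timeout client  60s   # idle timeout toward the client
    timeout server  60s   # idle timeout toward the server
    # Once a connection becomes a pure TCP tunnel, "timeout tunnel"
    # supersedes both client and server idle timeouts, so long-lived
    # idle tunnels survive without inflating the HTTP-side timeouts.
    timeout tunnel  1h
```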
Re: haproxy deleting domain socket on graceful reload if backlog overflows
Hi Andrew, James,

On Wed, Apr 12, 2017 at 11:46:57PM +0100, Andrew Smalley wrote:
> I do not see how the old haproxy being on a separate PID could do
> anything with a socket created by a new PID.

That's what James explained. The old process tries to clean up before
leaving and tries to clean its own socket, but in practice it ends up
removing the new socket if the connection attempt fails due to the new
socket being saturated. That's done in destroy_uxst_socket().

> Also there is a SYN_BLOCK firewall rule required during the reload? I
> ask because we have had no reports of such a race condition.

That's totally irrelevant. In his case these are unix sockets, not TCP
sockets.

James, I agree with you that what is done in destroy_uxst_socket() is
completely stupid. Even the comment saying "we might have been chrooted"
indicates that in some cases we could end up removing someone else's
socket (if permissions allow it, of course). Given that most of the time
users will run chrooted anyway, these sockets are almost never cleaned
up. So I'd vote for simply killing the contents of destroy_uxst_socket()
until we find a better idea to ensure we're really seeing *this*
process' socket. For example we could note the socket's inode upon
startup and only remove the socket if it matches the stored inode. This
is just an idea, of course.

In the mean time you can work around the problem by applying just the
one-line patch below:

 static void destroy_uxst_socket(const char *path)
 {
 	struct sockaddr_un addr;
 	int sock, ret;

+	return;

 	/* if the path was cleared, we do nothing */
 	if (!*path)
 		return;

Thanks for your detailed analysis. Would you be willing to provide a
patch? You can reuse your original e-mail as the commit message, it's
fully detailed and explains the situation very well.

Thanks!
Willy
Re: low load client payload intermittently dropped with a "cD" error (v1.7.3)
Hi Lincoln,

On Wed, Apr 12, 2017 at 08:24:41PM -0400, Lincoln Stern wrote:
(...)
> *haproxy finishes connecting to the server (SYNACK/ACK) (good)*
> 38.120527 IP 99.99.99.99.8000 > 10.10.10.10.34289: Flags [S.], seq
> 4125907118, ack 35568, win 28960, options [mss 1460,sackOK,TS val
> 662461622 ecr 82055529,nop,wscale 8], length 0
> 38.120619 IP 10.10.10.10.34289 > 99.99.99.99.8000: Flags [.], ack 1,
> win 229, options [nop,nop,TS val 82055545 ecr 662461622], length 0
>
> *And now there is nothing for 5 seconds when we should have seen a
> data packet from haproxy to the server with the "length 198" payload.*
>
> *It appears that haproxy never tried to send the data!?!?*

Thanks for this capture! This is a regression introduced in 1.7.3 while
trying to fix a bug in 1.7.2. It was fixed later by this commit:

    commit 51f2336ef7738c945ccedf5f1179825ac401862c
    Author: Willy Tarreau
    Date:   Sat Mar 18 17:11:37 2017 +0100

        BUG/MAJOR: stream-int: do not depend on connection flags to detect connectio
(...)

Please upgrade to 1.7.5 and it will be OK.

Regards,
Willy
Re: [RFC][PATCHES] seamless reload
On Wed, Apr 12, 2017 at 07:41:43PM +0200, Olivier Houchard wrote:
> +	if (default_tcp_maxseg == -1) {
> +		default_tcp_maxseg = -2;
> +		fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
> +		if (fd < 0)
> +			Warning("Failed to create a temporary socket!\n");
> +		else {
> +			if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &default_tcp_maxseg,
> +			    &ready_len) == -1)
> +				Warning("Failed to get the default value of TCP_MAXSEG\n");
> +		}
> +	}

Olivier, you're missing a close(fd) here, it'll leak this fd.

> +	if (default_tcp6_maxseg == -1) {
> +		default_tcp6_maxseg = -2;
> +		fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP);
> +		if (fd >= 0) {
> +			if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &default_tcp6_maxseg,
> +			    &ready_len) == -1)
> +				Warning("Failed ot get the default value of TCP_MAXSEG for IPv6\n");
> +			close(fd);
> +		}
> +	}
> +#endif
> +
(...)
> +	getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &tmpmaxseg, &len);
> +	if (tmpmaxseg != defaultmss &&
> +	    setsockopt(fd, IPPROTO_TCP, TCP_MAXSEG,
> +	               &defaultmss, sizeof(defaultmss)) == -1) {

Please fix the alignment for the argument here, it's a bit confusing :-)

Otherwise looks good. I think it was a good idea to create temporary
sockets to retrieve some default settings. That may be something we
could generalize to guess other parameters or capabilities if needed in
the future. For example we could use this to detect whether or not IPv6
is supported and emit errors only once instead of every bind line.
Another use case is to always know the MSS applied to a listening socket
in order to automatically adjust the SSL maxrecord instead of assuming
1460 by default. Over time we might find other use cases.

Cheers,
Willy