Re: ACL block ignored in 2.0.25

2021-11-18 Thread Olivier Houchard
On Thu, Nov 18, 2021 at 05:59:24PM +0100, Bart van der Schans wrote:
> A bit more digging:
> 
> It looks like this commit broke it:
> http://git.haproxy.org/?p=haproxy-2.0.git;a=commitdiff;h=8ab2a364a8c8adf0965e74a41a2ff3cebd43e7a9
> 
> Re-adding the following block on line 2826 in cfgparse.c in 2.0.25 "solves"
> the issue but I have no idea if that is the right thing to do:
> 
> -	/* move any "block" rules at the beginning of the http-request rules */
> -	if (!LIST_ISEMPTY(&curproxy->block_rules)) {
> -		/* insert block_rules into http_req_rules at the beginning */
> -		curproxy->block_rules.p->n    = curproxy->http_req_rules.n;
> -		curproxy->http_req_rules.n->p = curproxy->block_rules.p;
> -		curproxy->block_rules.n->p    = &curproxy->http_req_rules;
> -		curproxy->http_req_rules.n    = curproxy->block_rules.n;
> -		LIST_INIT(&curproxy->block_rules);
> -	}
> 
> Thanks
> Bart
> 
I think you are totally correct, and this is the right thing to do. That
block is unrelated to the other changes in that commit, and probably
should not have been removed.
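
For reference, here is a minimal sketch of the legacy "block" directive
that depends on this code path (the ACL here is made up for illustration).
Such rules are converted internally into "http-request deny" rules that
must be evaluated before any explicitly configured http-request rules,
which is exactly what that list splice guarantees:

frontend fe_web
  bind :80
  acl bad_src src 192.0.2.0/24
  # deprecated form of "http-request deny if bad_src",
  # evaluated before any explicit http-request rules
  block if bad_src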

Regards,

Olivier



Re: ACL block ignored in 2.0.25

2021-11-21 Thread Olivier Houchard
Hi Bart,

On Thu, Nov 18, 2021 at 08:06:55PM +0100, Bart van der Schans wrote:
> Here is the patch:
> 
> Fix "block" ACL by reverting accidental removal of code in
> 8ab2a364a8c8adf0965e74a41a2ff3cebd43e7a9
> ---
> diff --git a/src/cfgparse.c b/src/cfgparse.c
> index 857321e1..78f0ed76 100644
> --- a/src/cfgparse.c
> +++ b/src/cfgparse.c
> @@ -2823,6 +2823,16 @@ int check_config_validity()
> }
> }
> 
> +	/* move any "block" rules at the beginning of the http-request rules */
> +	if (!LIST_ISEMPTY(&curproxy->block_rules)) {
> +		/* insert block_rules into http_req_rules at the beginning */
> +		curproxy->block_rules.p->n    = curproxy->http_req_rules.n;
> +		curproxy->http_req_rules.n->p = curproxy->block_rules.p;
> +		curproxy->block_rules.n->p    = &curproxy->http_req_rules;
> +		curproxy->http_req_rules.n    = curproxy->block_rules.n;
> +		LIST_INIT(&curproxy->block_rules);
> +	}
> +
> /* check validity for 'tcp-request' layer 4/5/6/7 rules */
> cfgerr += check_action_rules(&curproxy->tcp_req.l4_rules, curproxy, &err_code);
> cfgerr += check_action_rules(&curproxy->tcp_req.l5_rules, curproxy, &err_code);

Your patch has been committed as feeccc6e40e4d61b1e9fa84aca961aceabe39371

Thanks a lot !

Olivier



Re: HAProxy performance on OpenBSD

2023-01-23 Thread Olivier Houchard
Hi Marc,

On Mon, Jan 23, 2023 at 12:13:13AM -0600, Marc West wrote:
> Hi,
> 
> We have been running HAProxy on OpenBSD for several years (currently
> OpenBSD 7.2 / HAProxy 2.6.7) and everything has been working perfectly
> until a recent event of higher-than-normal traffic. It was an unexpected
> flood to one site, and above ~1100 concurrent sessions we started to see major
> impacts to all sites behind haproxy in different frontends/backends than 
> the one with heavy traffic. All sites started to respond very slowly or 
> failed to load (including requests that haproxy denies before hitting 
> backend), health checks in various other backends started flapping, and
> once the heavy traffic stopped everything returned to normal. Total 
> bandwidth was less than 50Mbps on a 10G port (Intel 82599 ix fiber NIC).
> 
> After that issue we have been doing some load testing to try to gain
> more info and make any possible tweaks. Using the ddosify load testing
> tool (-d 60 -n 7) from a single machine (different from haproxy) is
> able to reproduce the same issue we saw with real traffic.
> 
> When starting a load test HAProxy handles 400-500 requests per second
> for about 3 seconds. After the first few seconds of heavy traffic, the
> test error rate immediately starts to shoot up to 75%+ connection/read
> timeouts and other haproxy sites/health checks start to be impacted. We
> had to stop the test after only 11 seconds to restore responsiveness to
> other sites. This is a physical server with 2x E5-2698 v4 20 core CPUs
> (hyperthreading disabled) and the haproxy process uses about 1545% CPU
> under this load. Overall CPU utilization is 21% user, 0% nice, 37% sys,
> 18% spin, 0.7% intr, 23.1% idle. There was no noticeable impact on
> connections to other services that this box NATs via PF to other servers
> outside of haproxy. 
> 
> Installing FreeBSD 13.1 on an identical machine and trying the same test
> on the same 2.6.7 with the same config and backend servers, the results
> are more what I would expect out of this hardware - haproxy has no
> problems handling over 10,000 req/sec and 40k connections without
> impacting any traffic/health checks, and only about 30% overall CPU
> usage at that traffic level. 100% success rate on the load tests with
> plenty of headroom on the haproxy box to handle way more. The backend
> servers were actually the bottleneck in the FreeBSD test.
> 
> I understand that raw performance on OpenBSD is sometimes not as high as
> other OSes in some scenarios, but the difference of 500 vs 10,000+
> req/sec and 1100 vs 40,000 connections here is very large so I wanted to
> see if there are any thoughts, known issues, or tunables that could
> possibly help improve HAProxy throughput on OpenBSD?
> 
> The usual OS tunables openfiles-cur/openfiles-max are raised to 200k,
> kern.maxfiles=205000 (openfiles peaked at 15k), and haproxy stats
> reports those as expected. PF state limit is raised to 1 million and
> peaked at 72k in use. BIOS power profile is set to max performance.
> 
> pid = 78180 (process #1, nbproc = 1, nbthread = 32)
> uptime = 1d 19h10m11s
> system limits: memmax = unlimited; ulimit-n = 20
> maxsock = 20; maxconn = 99904; maxpipes = 0
> 
> No errors that I can see in logs about hitting any limits. There is no
> change in results with http vs https, http/1.1 vs h2, with or without
> httplog, or reducing nbthread on this 40 core machine. If there are any
> other details I can provide please let me know.
> 
> Thanks in advance for any input!
> 
> 
> global
>   chroot  /var/haproxy
>   daemon  
>   log  127.0.0.1 local2
>   nbthread  32

I wonder if the problem comes from OpenBSD's thread implementation; I
fear you may have too much contention in the kernel, especially on a
NUMA machine such as yours. Does using, say, only 4 threads make things
better?
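
Something like this in your global section, keeping everything else
unchanged (just a sketch to test the hypothesis):

global
  chroot  /var/haproxy
  daemon
  log  127.0.0.1 local2
  # much lower thread count: threads are more likely to stay on one
  # socket, reducing cross-socket lock contention
  nbthread  4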

Regards,

Olivier



Re: HAProxy performance on OpenBSD

2023-01-24 Thread Olivier Houchard
On Tue, Jan 24, 2023 at 11:05:37PM +0100, Willy Tarreau wrote:
> On Tue, Jan 24, 2023 at 02:15:08PM -0600, Marc West wrote:
> > > Stupid question but I prefer to ask in order to be certain, are all of
> > > these 32 threads located on the same physical CPU ? I just want to be
> > > sure that locks (kernel or user) are not traveling between multiple CPU
> > > sockets, as that ruins scalability.
> > 
> > Very good observation. This is a 2 socket system, 20 cores per socket,
> > and since there is no affinity in OpenBSD unfortunately the 32 threads
> > are not on the same physical CPU.
> 
> Ah, then that's particularly bad. When you're saying "there is no
> affinity", do you mean the OS doesn't implement anything or only that
> haproxy doesn't support it ? Could you please check if you have man
> pages about "cpuset_setaffinity" or "sched_setaffinity" ? The former
> is the FreeBSD version and the second the Linux version.

Unfortunately, having a quick look at OpenBSD's sources, I don't think
they provide anything that could look like that. In fact, I'm not sure
there is any way to restrict a process to a specific CPU.

[...]

> Oh thank you very much. From what I'm seeing we're here:
> 
>   0x0af892c770b0:   mov    %r12,%rdi
>   0x0af892c770b3:   callq  0xaf892c24e40
>   0x0af892c770b8:   mov    %rax,%r12
>   0x0af892c770bb:   test   %rax,%rax
>   0x0af892c770be:   je     0xaf892c770e5
>   0x0af892c770c0:   mov    0x18(%r12),%rax
> =>0x0af892c770c5:   mov    0xa0(%rax),%r11
>   0x0af892c770cc:   test   %r11,%r11
>   0x0af892c770cf:   je     0xaf892c770b0
>   0x0af892c770d1:   mov    %r12,%rdi
>   0x0af892c770d4:   mov    %r13d,%esi
> 
>   1229          conn = srv_lookup_conn(&srv->per_thr[i].safe_conns, hash);
>   1230          while (conn) {
>   1231 ==>>>        if (conn->mux->takeover && conn->mux->takeover(conn, i) == 0) {
>                         ^^^
> 
> It's here. %rax==0, which is conn->mux when we're trying to dereference
> it to retrieve ->takeover. I can't see how it's possible to have a NULL
> mux here in an idle connection since they're placed there by the muxes
> themselves. But maybe there's a tiny race somewhere that's impossible to
> reach except under such an extreme contention between the two sockets.
> I really have no idea regarding this one for now, maybe Olivier could
> spot a race here ? Maybe we're missing a barrier somewhere.
 
One way I can see that happening, as unlikely as it may be, is if
h1_takeover() is called but fails to allocate a task or a tasklet, and
so calls h1_release(), which will set the mux to NULL and call
conn_free(). It was previously OK for that to happen, because conn_free()
would unconditionally remove the connection from any list, but since the
idle connections now stay in a tree, and no longer in a list, that no
longer happens.
Another possibility of h1_release() being called while still in the
idle tree is if h1_wake() gets called, but the only way I see that
happening is from mux_stopping_process(), whatever that is, so that's
unlikely to be the problem in this case.
Anyway, I suggest doing something along the lines of the attached patch,
and adding a BUG_ON() in conn_free() to catch any freeing of a connection
still in the idle tree.

Olivier
>From 2a0ac4b84b97aa05cd7befc5fcf45b03795f2e76 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Tue, 24 Jan 2023 23:59:32 +0100
Subject: [PATCH] MINOR: Add a BUG_ON() to detect destroying connection in idle
 list

Add a BUG_ON() in conn_free(), to check that when we're freeing a
connection, it is not still in the idle connections tree, otherwise the
next thread that will try to use it will probably crash.
---
 src/connection.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/connection.c b/src/connection.c
index 4a73dbcc8..97619ec26 100644
--- a/src/connection.c
+++ b/src/connection.c
@@ -498,6 +498,10 @@ void conn_free(struct connection *conn)
 	pool_free(pool_head_uniqueid, istptr(conn->proxy_unique_id));
 	conn->proxy_unique_id = IST_NULL;
 
+	/* Make sure the connection is not left in the idle connection tree */
+	if (conn->hash_node != NULL)
+		BUG_ON(conn->hash_node->node.node.leaf_p != NULL);
+
 	pool_free(pool_head_conn_hash_node, conn->hash_node);
 	conn->hash_node = NULL;
 
-- 
2.36.1



[PATCH] Add the possibility to compress requests

2023-04-03 Thread Olivier Houchard
Hi,

The attached patchset is a first attempt at adding the possibility to
compress requests, as well as responses.
It adds a new keyword for compression, "tocompress" (any better
alternative name would be appreciated). The valid values are "request"
if you want to compress requests, "response" if you want to compress
responses, or "both" if you want to compress both.
The default is to compress responses only.
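
For example, to enable it for both directions in a backend (a sketch
using the proposed syntax from the second patch; the backend name and
types here are just for illustration):

backend be_upload
  compression algo gzip
  compression type application/json text/plain
  compression tocompress both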

Any comment is more than welcome.

Thanks!

Olivier
>From 6c3e62baa888359521091387ce6ac8376a001259 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Mon, 3 Apr 2023 22:22:24 +0200
Subject: [PATCH 1/2] MINOR: compression: Make compression offload a flag

Turn compression offload into a flag in struct comp, instead of using
an int just for it.
---
 include/haproxy/compression-t.h | 5 ++++-
 src/flt_http_comp.c | 6 +++---
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/include/haproxy/compression-t.h b/include/haproxy/compression-t.h
index 062f17f76..cdbf0d14e 100644
--- a/include/haproxy/compression-t.h
+++ b/include/haproxy/compression-t.h
@@ -34,10 +34,13 @@
 
 #include 
 
+/* Compression flags */
+
+#define COMP_FL_OFFLOAD		0x0001 /* Compression offload */
 struct comp {
 	struct comp_algo *algos;
 	struct comp_type *types;
-	unsigned int offload;
+	unsigned int flags;
 };
 
 struct comp_ctx {
diff --git a/src/flt_http_comp.c b/src/flt_http_comp.c
index ceda3fd5e..507009692 100644
--- a/src/flt_http_comp.c
+++ b/src/flt_http_comp.c
@@ -451,8 +451,8 @@ select_compression_request_header(struct comp_state *st, struct stream *s, struc
 
 	/* remove all occurrences of the header when "compression offload" is set */
 	if (st->comp_algo) {
-		if ((s->be->comp && s->be->comp->offload) ||
-		(strm_fe(s)->comp && strm_fe(s)->comp->offload)) {
+		if ((s->be->comp && (s->be->comp->flags & COMP_FL_OFFLOAD)) ||
+		(strm_fe(s)->comp && (strm_fe(s)->comp->flags & COMP_FL_OFFLOAD))) {
 			http_remove_header(htx, &ctx);
 			ctx.blk = NULL;
 			while (http_find_header(htx, ist("Accept-Encoding"), &ctx, 1))
@@ -690,7 +690,7 @@ parse_compression_options(char **args, int section, struct proxy *proxy,
   args[0], args[1]);
 			ret = 1;
 		}
-		comp->offload = 1;
+		comp->flags |= COMP_FL_OFFLOAD;
 	}
 	else if (strcmp(args[1], "type") == 0) {
 		int cur_arg = 2;
-- 
2.39.1

>From 77a5657592338db2ed53189c50828b3c0ed52716 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Tue, 4 Apr 2023 00:14:36 +0200
Subject: [PATCH 2/2] MEDIUM: compression: allow to compress requests too.

Make it so we can compress requests, as well as responses.
A new compression keyword is added, "tocompress". It takes exactly one
argument. The valid values for that argument are "response", if we want
to compress only responses, "request", if we want to compress only
requests, or "both", if we want to compress both.
The default value is "response".
---
 doc/configuration.txt   |  12 ++-
 include/haproxy/compression-t.h |   3 +
 src/flt_http_comp.c | 165 +++-
 3 files changed, 154 insertions(+), 26 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 586689ca2..8f86b285e 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -5077,7 +5077,7 @@ compression type <mime type> ...
 compression algo gzip
 compression type text/html text/plain
 
-  See also : "compression offload"
+  See also : "compression offload", "compression tocompress"
 
 compression offload
   Makes HAProxy work as a compression offloader only.
@@ -5099,7 +5099,15 @@ compression offload
   If this setting is used in a defaults section, a warning is emitted and the
   option is ignored.
 
-  See also : "compression type", "compression algo"
+  See also : "compression type", "compression algo", "compression tocompress"
+
+compression tocompress 
+  Makes haproxy able to compress both requests and responses.
+  Valid values are "request", to compress only requests, "response", to
+  compress only responses, or "both", when you want to compress both.
+  The default value is "response".
+
+  See also : "compression type", "compression algo", "compression offload"
 
cookie <name> [ rewrite | insert | prefix ] [ indirect ] [ nocache ]
   [ postonly ] [ preserve ] [ httponly ] [ secure ]
diff --git a/include/haproxy/compression-t.h b/include/haproxy/compression-t.h
index cdbf0d14e..d15d15ee3 100644
--- a/include/haproxy/compression-t.h
+++ b/include/haproxy/compression-t.h
@@ -37,6 +37,9 @@
 /* Compression flags */
 
 #define COMP_FL_OFFLOAD		0x0001 /* Compression offload */

Re: [PATCH] Add the possibility to compress requests

2023-04-05 Thread Olivier Houchard
Hi,

On Tue, Apr 04, 2023 at 09:45:24PM +0200, Willy Tarreau wrote:
> Hi Olivier,
> 
> On Tue, Apr 04, 2023 at 12:29:15AM +0200, Olivier Houchard wrote:
> > Hi,
> > 
> > The attached patchset is a first attempt at adding the possibility to
> > compress requests, as well as responses.
> 
> This is pretty cool, I definitely see how this can be super useful :-)
 
:)

> > It adds a new keyword for compression, "tocompress" (any better
> > alternative name would be appreciated).
> 
> I would call it "direction" which I find more natural and understandable.
 
I like that a lot more. Thank you!

> > The valid values are "request",
> > if you want to compress requests, "response" if you want to compress
> > responses or "both", if you want to compress both.
> > The default is to compress responses only.
> 
> I have a few comments below (besides renaming "tocompress").
 
[...]

> > diff --git a/include/haproxy/compression-t.h b/include/haproxy/compression-t.h
> > index cdbf0d14e..d15d15ee3 100644
> > --- a/include/haproxy/compression-t.h
> > +++ b/include/haproxy/compression-t.h
> > @@ -37,6 +37,9 @@
> >  /* Compression flags */
> >  
> >  #define COMP_FL_OFFLOAD		0x0001 /* Compression offload */
> > +#define COMP_FL_COMPRESS_REQ   0x0002 /* Compress requests */
> > +#define COMP_FL_COMPRESS_RES   0x0004 /* Compress responses */
> 
> So these ones would be COMP_FL_DIR_{REQ,RES}.
 
Done!

> >  struct comp {
> > struct comp_algo *algos;
> > struct comp_type *types;
> > diff --git a/src/flt_http_comp.c b/src/flt_http_comp.c
> > index 507009692..7680e241f 100644
> > --- a/src/flt_http_comp.c
> > +++ b/src/flt_http_comp.c
> > @@ -33,7 +33,9 @@ struct flt_ops comp_ops;
> >  
> >  struct comp_state {
> > struct comp_ctx  *comp_ctx;   /* compression context */
> > +   struct comp_ctx  *comp_ctx_req; /* compression context for request */
> > struct comp_algo *comp_algo;  /* compression algorithm if not NULL */
> > +   struct comp_algo *comp_algo_req; /* Algo to be used for request */
> > unsigned int  flags;  /* COMP_STATE_* */
> 
> I find it confusing to have comp_ctx and comp_ctx_req (and same for algo).
> I'd rather see a preliminary patch renaming comp_ctx to comp_ctx_res and
> comp_algo to comp_algo_res (just perform a mechanical rename). Or you could
> also have an array [2] for each, which will even remove some ifs in the
> code.
 
Ok I made that an array.

> Also I don't understand how you can have different algos for each direction
> since the config language allows you to define "compression algo" and
> "compression tocompress" so you cannot (apparently) have a situation where
> the two algos differ. I think it can make sense to use different algos for
> each direction (at least compression settings, e.g. expensive compression
> for uploads, relayed over expensive links, little compression for downloads
> that possibly comes from a cache).
 
The idea was that while only one algo could be used for requests, the
response algorithm would depend on what was supported by the client.

> Thus maybe an approach could be to use the algo to indicate the direction
> at the same time:
> 
>compression algo foobar   => sets response only, for compatibility
>compression algo-res foobar   => sets response only, explicit syntax
>compression algo-req foobar   => sets request only, explicit syntax
> 
> Then there's probably no need to have a keyword to set it at once in
> both directions since both actions are totally independent and it's
> probably not desired to encourage users to conflate the two directions
> in a single keyword.
> 
> You might need the same for "compression types" I think, so that you
> don't try to compress regular forms or login:password that will confuse
> applications, but only large data.
 
But I did that, so now there's even more reason to have separate
algorithms.
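
So a configuration following that proposal would look something like
this (a sketch only; these are the keyword names as discussed above,
and the final syntax may differ):

backend be_api
  compression algo-res gzip
  compression types-res text/html text/plain
  compression algo-req gzip
  compression types-req application/json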

> > -static int set_compression_response_header(struct comp_state *st,
> > -  struct stream *s,
> > -  struct http_msg *msg);
> > +static int set_compression_header(struct comp_state *st,
> > + struct stream *s,
> > + struct http_msg *msg,
> > + int for_response);
>   
> For this one as well I would call it "direction" (or "dir"), as we
>

Re: [PATCH] Add the possibility to compress requests

2023-04-06 Thread Olivier Houchard
On Thu, Apr 06, 2023 at 09:17:34PM +0200, Willy Tarreau wrote:
> > > > +	/* limit compression rate */
> > > > +	if (global.comp_rate_lim > 0)
> > > > +		if (read_freq_ctr(&global.comp_bps_in) > global.comp_rate_lim)
> > > > +			goto fail;
> > > 
> > > I had a doubt regarding this one but I think it's indeed reasonable to
> > > use the same global value for all directions as the goal was precisely
> > > to be able to enforce a total limit on all compressed streams to limit
> > > the resource impacts.
> >  
> > I think it makes sense, unless someone wants to give more priority to
> > request compression vs response compression, or vice versa, but I'll
> > rethink it if somebody gives me a valid use-case.
> 
> We'll see, but I strongly doubt this would ever happen because those who
> want to enforce speed limits are often hosting providers who don't want
> that a single site eats all their CPU, so they configure maximum compression
> bandwidths. It dates back from the era where it was still common to start
> several daemons, which is no more the case these days. So actually if
> anything would change in the short term it would rather be to have
> per-frontend and/or per-backend total limits rather than splitting
> req/res which still concern the same application.
 
Yeah that makes sense.

> > > I have not seen anything updating the stats, so I suspect that request
> > > and response compression stats are accounted as response stats, which
> > > can be a problem showing compression rates larger than 100%. If you hover
> > > over your frontend's "Bytes / Out" fields, you'll see a yellow box with
> > > numbers such as:
> > > 
> > >   Response bytes in:  468197882863
> > >   Compression in: 117157038990
> > >   Compression out:93277104103 (79%)
> > >   Compression bypass: 0
> > >   Total bytes saved:  23879934887 (5%)
> > > 
> > > Here I'm figuring that we'll need to add the same for the upload, so
> > > these are extra stats. I can possibly help locate where to place them,
> > > but I'll need that you figure where they are updated.
> > 
> > Ok so I've had mixed feelings about this, because I thought a global
> > counter could be useful too. But ultimately I think it's better to keep
> > them separated, if only so that the percentages make sense.
> 
> Yes, and another point is that when you start with a global one and
> finally decide to introduce a separate one for one direction so that
> you can subtract it from the total to get the other one, it's not
> uncommon to see negative numbers due to rounding errors or timings,
> and that's extremely confusing for those who deal with that. Thus I
> prefer to let users sum data on their side rather than providing
> totals and separate ones later.
 
That's a good point.

> > We could
> > have had Request + Response bytes there, but I think it's less useful.
> > So I added the counters, but do not expose them for requests yet, as I'm
> > unsure where to do that exactly, but that can easily be addressed with a
> > separate commit.
> 
> Yes we can have a look later. I think we'll add a new metric in the stats
> that will appear in a new tooltip on the bytes/in column of the stats
> page and that will be OK.
 
Yeah I was considering something like this, but wasn't so sure.

> Just a few tiny cosmetic comments below:

[Took care of unaligned comments]

> > From: Olivier Houchard 
> > Date: Thu, 6 Apr 2023 00:33:48 +0200
> > Subject: [PATCH 5/5] MEDIUM: compression: Make it so we can compress 
> > requests
> >  as well.
> (...)
> > diff --git a/doc/configuration.txt b/doc/configuration.txt
> > index 3468c78f3..63e8d3440 100644
> > --- a/doc/configuration.txt
> > +++ b/doc/configuration.txt
> > @@ -5099,7 +5111,15 @@ compression offload
> >If this setting is used in a defaults section, a warning is emitted and 
> > the
> >option is ignored.
> >  
> > -  See also : "compression type", "compression algo"
> > +  See also : "compression type", "compression algo", "compression "
>  ^^^
> A word is missing here, I suspect it was "direction" since that one
> references "offload" which is where we are.
 
Ahahah yeah, pretty sure I removed "tocompress" to replace it with
direction, but was probably distracted before I did so, and forgot.

> OK otherwise it looks good to me. I suggest you adjust these cosmetic
> details and directly push it.
 
Done, thanks!

> I'm having one question by the way: how did you manage to test this ?
> Did you find a server that supports decompressing requests, or do you
> only have some in-house stuff for this ? I'm wondering if we could manage
> to make a regtest for this, at least by putting some expect rules on
> the content-encoding header probably.

My test server was actually an nginx with custom Lua code to decompress
stuff. Maybe we can do something similar with haproxy?

Regards,

Olivier



Re: Clarification for ARM support

2020-03-09 Thread Olivier Houchard
Hi,

On Mon, Mar 09, 2020 at 10:33:55AM +0200, Emilio Fernandes wrote:
> Hello HAProxy team,
> 
> At https://www.haproxy.org/#plat I see that ARM architecture is supported
> for Linux with newer kernel.
> To avoid confusion could you please add next to it whether it is only 32
> bit, 64 bit or both ? Just like "x86, x86_64" before that.
> 
> Recently we started testing on ARM64 and so far it works great! Great job!
> Thank you!
> 

I guess this was written before arm64 was really a thing, but both 32-bit
and 64-bit ARM are actively supported.

Regards,

Olivier



Re: commit 493d9dc makes a SVN-checkout stall..

2020-03-24 Thread Olivier Houchard
Hi Pieter,

On Wed, Mar 25, 2020 at 01:14:46AM +0100, PiBa-NL wrote:
> Hi List, Willy,
> 
> Today I thought let's give v2.2-dev5 a try for my production environment ;).
> Soon it turned out to cause SVN checkouts to stall/disconnect for a
> repository we run locally on a Collab-SVN server.
> 
> I tracked it down to this commit: 493d9dc (MEDIUM: mux-h1: do not blindly
> wake up the tasklet at end of request anymore) causing the problem for the
> first time. Is there something tricky there that could be suspected to cause
> the issue? Perhaps a patch I can try?
> 
> While 'dissecting' the issue I deleted the whole directory each time and
> performed a new svn-checkout several times. It doesn't always stall at the
> exact same point, but usually after checking out around +- 20 files, something
> between 0.5 and 2 MB. The commit before that one allows me to checkout
> 500+ MB through haproxy without issue. A Wireshark capture seems to show that
> haproxy is sending several RST,ACK packets for 4 different connections
> to the svn-server at the same millisecond, after it was quiet for 2 seconds.
> The whole issue happens within 15 seconds, from the start of the checkout
> until it stalls.
> 
> The 'nokqueue' option I usually try on my FreeBSD machine doesn't change anything.
> 
> Hope you have an idea where to look. Providing captures/logs is a bit
> difficult without some careful scrubbing..
> 
> Regards,
> PiBa-NL (Pieter)
> 

No answer yet, but I just tried to do a checkout of the FreeBSD svn tree
via haproxy, and it indeed doesn't work. It's a bit late right now, but
I'll have a look tomorrow :)

Regards,

Olivier



Re: commit 493d9dc makes a SVN-checkout stall..

2020-03-25 Thread Olivier Houchard
Hi Pieter,

On Wed, Mar 25, 2020 at 01:42:14AM +0100, Olivier Houchard wrote:
> Hi Pieter,
> 
> On Wed, Mar 25, 2020 at 01:14:46AM +0100, PiBa-NL wrote:
> > [...]
> 
> No answer yet, but I just tried to do a checkout of the FreeBSD svn tree
> via haproxy, and it indeed doesn't work. It's a bit late right now, but
> I'll have a look tomorrow :)
> 

I think I figured it out, and commit 69664419d209d2f64dbcd90f991e7ec2efff2ee2
should fix it, at least now I can do a full svn checkout.

Can you confirm it does the trick for you ?

Thanks !

Olivier



Re: commit 493d9dc makes a SVN-checkout stall..

2020-03-25 Thread Olivier Houchard
Hi Willy,

On Wed, Mar 25, 2020 at 02:03:56PM +0100, Willy Tarreau wrote:
> On Wed, Mar 25, 2020 at 12:40:33PM +0100, Olivier Houchard wrote:
> > I think I figured it out, and commit 
> > 69664419d209d2f64dbcd90f991e7ec2efff2ee2
> > should fix it, at least now I can do a full svn checkout.
> > 
> > Can you confirm it does the trick for you ?
> 
> Olivier, I think you made a last minute change after your fix because
> it broke the build:
> 
> https://travis-ci.com/github/haproxy/haproxy/jobs/302517378
> 
>   src/mux_h1.c:2495:23: error: incompatible pointer types passing
>   'struct connection **' to parameter of type 'const struct buffer *'
>   [-Werror,-Wincompatible-pointer-types]
>           if (unlikely(b_data(&h1c->conn)))
>                               ^~~~~~~~~~
>   include/common/compiler.h:98:40: note: expanded from macro 'unlikely'
>   #define unlikely(x) (__builtin_expect((x) != 0, 0))
>                                          ^
>   include/common/buf.h:100:50: note: passing argument to parameter 'b' here
>   static inline size_t b_data(const struct buffer *b)
>                                                    ^
> I think it was h1c->dbuf not conn.
> 
> Willy

That is... interesting, I'm not sure how I reached such an outstanding result.

This is now fixed, sorry about that !

Olivier



Re: Weird issues with UNIX-Sockets on 2.1.x

2020-03-27 Thread Olivier Houchard
Hi Christian,

On Fri, Mar 27, 2020 at 02:37:41PM +0100, Christian Ruppert wrote:
> Hi list,
> 
> we are now seeing some weird issues, for the second time, where *some*
> SSL sockets seem to be broken, as well as stats sockets.
> HTTP seems to work fine, but SSL ones are broken. It happened
> at least on 2.1.3 and *perhaps* on 2.1.2 as well. We're not sure whether
> the first time was on 2.1.2 or 2.1.3.
> The one that failed today was updated yesterday, so HAProxy has an
> uptime of about 24h.
> We're using threads. default + HTTP is using 1 thread, 1 is dedicated
> to a TCP listener/Layer-4, one is for RSA only, and all the rest is for
> ECC.
[...]
> The problem occurred around 13:40 (CET, in case it matters at some point)
> 
> Any ideas so far?
> 

So basically, it used to work, and suddenly you get errors on any TLS
connection ?
If you still have the TCP stat socket working, can you show the output
of "show fd" ?

Thanks !

Olivier



Re: Weird issues with UNIX-Sockets on 2.1.x

2020-03-27 Thread Olivier Houchard
On Fri, Mar 27, 2020 at 04:21:20PM +0100, Christian Ruppert wrote:
> During the reload I just found something in the daemon log:
> Mar 27 13:37:54 somelb haproxy[20799]: [ALERT] 086/133748 (20799) : 
> Starting proxy someotherlistener: cannot bind socket [0.0.0.0:18540]
> Mar 27 13:37:54 somelb haproxy[20799]: [ALERT] 086/133748 (20799) : 
> Starting proxy someotherlistener: cannot bind socket [:::18540]
> 
> So this happened during the reload, and seems to have caused all the
> further issues/trouble.
> 

That would make sense. Does that mean you have old processes hanging
around ? Do you use seamless reload ? If so, it shouldn't attempt to
bind the sockets, but get them from the old process.

Regards,

Olivier



Re: Weird issues with UNIX-Sockets on 2.1.x

2020-03-27 Thread Olivier Houchard
On Fri, Mar 27, 2020 at 04:32:21PM +0100, Christian Ruppert wrote:
> On 2020-03-27 16:27, Olivier Houchard wrote:
> > On Fri, Mar 27, 2020 at 04:21:20PM +0100, Christian Ruppert wrote:
> >> During the reload I just found something in the daemon log:
> >> Mar 27 13:37:54 somelb haproxy[20799]: [ALERT] 086/133748 (20799) :
> >> Starting proxy someotherlistener: cannot bind socket [0.0.0.0:18540]
> >> Mar 27 13:37:54 somelb haproxy[20799]: [ALERT] 086/133748 (20799) :
> >> Starting proxy someotherlistener: cannot bind socket [:::18540]
> >> 
> >> So during the reload, this happened and seems to have caused any 
> >> further
> >> issues/trouble.
> >> 
> > 
> > That would make sense. Does that mean you have old processes hanging
> > around ? Do you use seemless reload ? If so, it shouldn't attempt to
> > bind the socket, but get them from the old process.
> 
> I remember that it was necessary to have a systemd wrapper around, as it 
> caused trouble otherwise, due to PID being changed etc.
> Not sure if that wrapper is still in use. In this case it's systemd 
> though.
> [Unit]
> Description=HAProxy Load Balancer
> After=network.target
> 
> [Service]
> Environment="CONFIG=/etc/haproxy/haproxy.cfg" "PIDFILE=/run/haproxy.pid"
> ExecStartPre=/usr/sbin/haproxy -f $CONFIG -c -q
> ExecStart=/usr/sbin/haproxy -Ws -f $CONFIG -p $PIDFILE
> ExecReload=/usr/sbin/haproxy -f $CONFIG -c -q
> ExecReload=/bin/kill -USR2 $MAINPID
> KillMode=mixed
> Restart=always
> SuccessExitStatus=143
> TimeoutStopSec=30
> Type=notify

[...]

> We've added the TimeoutStopSec=30 for some reason (I'd have to ask my 
> colleague, something took longer or something like that, since we have 
> quite a lot of frontends/listeners/backends)
> Only the two processes I mentioned before are / were running. Seems like 
> the fallback didn't work properly?
> 

The wrapper is no longer needed, it has been superseded by the
master-worker mode (which you seem to be using, given you're using -Ws).
It is possible the old process refuses to die, and you end up hitting the
timeout so it eventually gets killed, but too late.
Do you have "expose-fd listeners" on the unix stats socket ? Using it
will allow the new process to connect to the old process' stats socket,
and get all the listening sockets, so that it won't have to bind them.
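
Concretely, that means something like this on the stats socket line
(just a sketch, the path is an example):

global
  stats socket /var/run/haproxy.sock mode 600 level admin expose-fd listeners

With that in place, the new process fetches the listening FDs over the
socket during the reload instead of calling bind() again.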

Regards,

Olivier



Re: [PATCH 0/1] fix idle connections cleanup

2020-05-02 Thread Olivier Houchard
Hi William,

On Sat, May 02, 2020 at 09:52:35PM +0200, William Dauchy wrote:
> Hello Olivier,
> 
> The following patch is an attempt to fix a change of behavior we encounter
> following commit:
> 
> 079cb9af22da6 ("MEDIUM: connections: Revamp the way idle connections are 
> killed")
> being the origin of this new behaviour.
> 
> All details are in the commit message itself.
> The approach is most likely not the good one but feel free to advise so
> I may improve it.
> 
This is a good catch !

And I think your fix is correct, so I will apply it as-is. However, I
think it should be applied to previous branches, which also had idle
connection pools. It is probably less noticeable before 079cb9af22da6,
because before that patch, one thread couldn't reuse a connection owned
by another thread, so to reproduce it you'd have to test until you
hit the same thread twice, but I don't see why it wouldn't be an issue
there.

Thanks a lot !

Olivier



Re: [PATCH 1/1] BUG/MEDIUM: connections: force connections cleanup on server changes

2020-05-02 Thread Olivier Houchard
On Sat, May 02, 2020 at 10:14:15PM +0200, William Dauchy wrote:
> On Sat, May 02, 2020 at 09:52:36PM +0200, William Dauchy wrote:
> > +void srv_cleanup_connections(struct server *srv)
> > +{
> > +   struct connection *conn;
> > +   int did_remove;
> > +   int i;
> > +   int j;
> > +
> > +	HA_SPIN_LOCK(OTHER_LOCK, &idle_conn_srv_lock);
> > +	for (i = 0; i < global.nbthread; i++) {
> > +		did_remove = 0;
> > +		HA_SPIN_LOCK(OTHER_LOCK, &toremove_lock[i]);
> > +		for (j = 0; j < srv->curr_idle_conns; j++) {
> > +			conn = MT_LIST_POP(&srv->idle_conns[i], struct connection *, list);
> > +			if (!conn)
> > +				conn = MT_LIST_POP(&srv->safe_conns[i], struct connection *, list);
> > +			if (!conn)
> > +				break;
> 
> just realised I forgot:
>   did_remove = 1;
> 
> but will wait before sending a possible v2 with other feedbacks.

If that's the only change you have for a v2, don't bother, I already
integrated it in what I plan to push :)

Regards,

Olivier



Re: [PATCH 1/1] BUG/MEDIUM: connections: force connections cleanup on server changes

2020-05-02 Thread Olivier Houchard
On Sat, May 02, 2020 at 10:23:05PM +0200, William Dauchy wrote:
> On Sat, May 02, 2020 at 10:17:01PM +0200, Olivier Houchard wrote:
> > If that's the only change you have for a v2, don't bother, I already
> > integrated it in what I plan to push :)
> 
> ok, thanks a lot!
> 

It's now pushed, so any fix for the fix is now welcome :)

Thanks a lot !

Olivier



Re: [PATCH] CLEANUP: connections: align function declaration

2020-05-04 Thread Olivier Houchard
Hi William,

On Mon, May 04, 2020 at 02:08:54PM +0200, William Dauchy wrote:
> On Mon, May 04, 2020 at 01:56:10PM +0200, William Dauchy wrote:
> > Sorry for this one, I originally did update the server.h file, but I
> > don't know why I forgot to add it in my patch submission :/
> > then I saw you fixed it while pushing my commit, thank you for this.
> 
> I'm talking too fast, the .h was correctly sent.
> In fact it was indeed a mistake added in between while pushing the
> fix. I lost myself between the different versions here and there.
> Anyway :)

You did nothing wrong, this is actually my fault. I decided to make the
function static, as there was no need to export the symbol, so I
removed the changes from server.h, and added a static prototype at the
beginning of server.c, but quite obviously I forgot to add "static" to
the function definition :)
Your patch is now applied, thanks !

Olivier



Re: compile issues on FreeBSD/i386

2020-06-20 Thread Olivier Houchard
Hi Dmitry,

On Sat, Jun 20, 2020 at 03:30:55PM +0300, Dmitry Sivachenko wrote:
> Hello!
> 
> I am trying to compile haproxy-2.2-dev10 on FreeBSD-12/i386 (i386 is 
> important here) with clang version 9.0.1.
> 
> I get the following linker error:
> 
>   LD  haproxy
> ld: error: undefined symbol: __atomic_fetch_add_8
> >>> referenced by backend.c
> >>>   src/backend.o:(assign_server)
> >>> referenced by backend.c
> >>>   src/backend.o:(assign_server)
> >>> referenced by backend.c
> >>>   src/backend.o:(assign_server_and_queue)
> >>> referenced by backend.c
> >>>   src/backend.o:(assign_server_and_queue)
> >>> referenced by backend.c
> >>>   src/backend.o:(assign_server_and_queue)
> >>> referenced by backend.c
> >>>   src/backend.o:(assign_server_and_queue)
> >>> referenced by backend.c
> >>>   src/backend.o:(connect_server)
> >>> referenced by backend.c
> >>>   src/backend.o:(connect_server)
> >>> referenced by backend.c
> >>>   src/backend.o:(connect_server)
> >>> referenced by backend.c
> >>>   src/backend.o:(srv_redispatch_connect)
> >>> referenced 233 more times
> 
> ld: error: undefined symbol: __atomic_store_8
> 
> For some time we apply the following patch to build on FreeBSD/i386:
> 
> --- include/common/hathreads.h.orig 2018-02-17 18:17:22.21940 +
> +++ include/common/hathreads.h  2018-02-17 18:18:44.598422000 +
> @@ -104,7 +104,7 @@ extern THREAD_LOCAL unsigned long tid_bit; /* The bit 
>  /* TODO: thread: For now, we rely on GCC builtins but it could be a good 
> idea to
>   * have a header file regrouping all functions dealing with threads. */
>  
> -#if defined(__GNUC__) && (__GNUC__ < 4 || __GNUC__ == 4 && __GNUC_MINOR__ < 7) && !defined(__clang__)
> +#if (defined(__GNUC__) && (__GNUC__ < 4 || __GNUC__ == 4 && __GNUC_MINOR__ < 7) && !defined(__clang__)) || (defined(__clang__) && defined(__i386__))
>  /* gcc < 4.7 */
>  
>  #define HA_ATOMIC_ADD(val, i)	__sync_add_and_fetch(val, i)
> 
> (it is from older -dev but still applies to include/haproxy/atomic.h and 
> fixes the build).
> 
> If this patch is correct for i386, may be we include it to haproxy sources?
 
The problem seems to be that clang assumes that the atomic accesses may be
done unaligned. Fixing most of it is pretty straightforward, it mostly
involves adding __attribute__((aligned(64))) on all the relevant
structure members. However, no amount of cajoling helps for trace.c: as
long as we use ev_ptr, it always assumes it may be unaligned.
Even funnier, while
HA_ATOMIC_STORE(&src->report_events, 0); will work if report_events is
flagged as 64bits-aligned, HA_ATOMIC_STORE(ev_ptr == &src->report_events
? &src->report_events : &src->report_events, 0) will make clang unhappy.
So I don't know how to handle that properly. Willy, I think Dmitry's
solution of using the old builtins for clang/i386 might be the least
intrusive way of fixing it.
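
To illustrate the issue (a standalone sketch, not haproxy code): on
i386, clang only inlines a 64-bit atomic when it can prove the object
is sufficiently aligned, otherwise it emits calls to libatomic helpers
such as __atomic_fetch_add_8, which is exactly the undefined symbol
above. Forcing the alignment on the member is what makes it work:

#include <stdint.h>

struct counters {
	/* force 8-byte alignment so clang can inline the 64-bit atomic
	 * (cmpxchg8b) instead of calling __atomic_fetch_add_8() */
	uint64_t served __attribute__((aligned(8)));
};

static inline uint64_t inc_served(struct counters *c)
{
	return __atomic_add_fetch(&c->served, 1, __ATOMIC_RELAXED);
}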

> PS:  with that patch applied I get the following warning which can have sense:
> 
> src/stick_table.c:3462:12: warning: result of comparison 'unsigned long' > 4294967295
> is always false [-Wtautological-type-limit-compare]
> val > 0xFFFFFFFF)
> ~~~ ^ ~~~~~~~~~~
> 
This one sounds mostly harmless, I guess we should use a long long
here, and strtoull().

Olivier





Re: `ssl_fc_has_early` fetcher and 0rtt

2020-09-09 Thread Olivier Houchard
On Wed, Sep 09, 2020 at 05:35:28PM +0200, Willy Tarreau wrote:
> On Wed, Sep 09, 2020 at 04:57:58PM +0200, William Dauchy wrote:
> > > I think it's not easy to reproduce these tests, you need a high enough
> > > latency between haproxy and the client so that the handshake is not
> > > already completed when you evaluate the rule, and of course you need
> > > to make sure the client sends using early data. I don't remember how
> > > Olivier used to run his tests but I remember that it was a bit tricky,
> > > so it's very possible that you never fall into the situation where you
> > > can see the unvalidated early data yet.
> > 
> > It means my understanding of this fetcher was wrong indeed.
> > For me the protection was here:
> >   http-request wait-for-handshake if ! METH_GET
> > and the fetcher here to log whether it was a 0rtt request or not.
> > In reality, it means all our requests have completed the handshake
> > when the rule is evaluated (which is surprising looking at the number
> > we have).
> 
> That seems strange indeed but looking at the code that's what I'm
> seeing. Was your access to ssl_fc_has_early placed before or after the
> rule above ? If it's after it must indeed report false.
> 
> > So maybe we can possibly work on an alternative fetcher to know
> > whether this was a 0rtt request? Or is there another way?
> 
> I seem to remember there was one but can't find it, so I may have been
> confused. With this said, it doesn't provide a big information since
> once the handshake is completed, it's exactly identical to a regular
> one. But it can be nice for statistics at least.
> 

Pretty sure the original version always returned 1 if there were early
data, but we changed it so that it would return 0 once the handshake was
done, as we thought it was more useful. That may be what you're thinking
about.
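
So if you want to record whether a request arrived as (still
unvalidated) early data, you have to sample the fetch before forcing
the handshake, along these lines (a sketch only; the bind line and
variable name are just examples):

frontend fe_tls
  bind :443 ssl crt /etc/haproxy/cert.pem allow-0rtt
  # sample before wait-for-handshake: once the handshake has
  # completed, ssl_fc_has_early reports 0
  http-request set-var(txn.early) ssl_fc_has_early
  http-request wait-for-handshake if ! METH_GET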

Olivier



Dynamic cookies support

2017-03-14 Thread Olivier Houchard
Hi guys,

You'll find attached patches to add support for dynamically-generated session
cookies for each server; the said cookies will be a hash of the server's IP
address, the TCP port, and a provided secret key.
This adds 2 keywords to the config file: a "dynamic" keyword on the cookie
line, which just enables dynamic cookies, and a "dynamic-cookie-key" keyword,
for backends, which specifies the secret key to use.
This is a first step to be able to add and remove servers on the fly, 
without modifying the configuration. That way, we can have cookies that are
automatically usable for all the load-balancers.

Any comment would be welcome.

Thanks !

Olivier
>From a29344438de3777ab692978b5195adfd100f219f Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Tue, 14 Mar 2017 20:01:29 +0100
Subject: [PATCH 1/2] MINOR: server: Add dynamic session cookies.
X-Bogosity: Ham, tests=bogofilter, spamicity=0.00, version=1.2.4

This adds a new "dynamic" keyword for the cookie option. If set, a cookie
will be generated for each server (assuming one isn't already provided on
the "server" line), from the IP of the server, the TCP port, and a secret
key provided. To provide the secret key, a new keyword has been added,
"dynamic-cookie-key", for backends.

Example :
backend bk_web
  balance roundrobin
  dynamic-cookie-key "bla"
  cookie WEBSRV insert dynamic
  server s1 127.0.0.1:80 check
  server s2 192.168.56.1:80 check

This is a first step to be able to dynamically add and remove servers,
without modifying the configuration file, and still have all the load
balancers redirect the traffic to the right server.

Provide a way to generate session cookies, based on the IP address of the
server, the TCP port, and a secret key provided.
---
 doc/configuration.txt  | 21 +
 include/proto/server.h |  5 +++
 include/types/proxy.h  |  2 ++
 include/types/server.h |  1 +
 src/cfgparse.c | 42 -
 src/server.c   | 85 --
 6 files changed, 152 insertions(+), 4 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index a505f70..5cc1bdd 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -2771,6 +2771,7 @@ contimeout  (deprecated)
 cookie <name> [ rewrite | insert | prefix ] [ indirect ] [ nocache ]
   [ postonly ] [ preserve ] [ httponly ] [ secure ]
   [ domain <domain> ]* [ maxidle <idle> ] [ maxlife <life> ]
+  [ dynamic ]
   Enable cookie-based persistence in a backend.
   May be used in sections :   defaults | frontend | listen | backend
  yes   |no|   yes  |   yes
@@ -2922,6 +2923,13 @@ cookie <name> [ rewrite | insert | prefix ] [ indirect ] [ nocache ]
   is stronger than the maxidle method in that it forces a
   redispatch after some absolute delay.
 
+dynamic   Activate dynamic cookies. When used, a session cookie is
+  dynamically created for each server, based on the IP and port
+  of the server, and a secret key, specified in the
+  "dynamic-cookie-key" backend directive.
+  The cookie will be regenerated each time the IP address changes,
+  and is only generated for IPv4/IPv6.
+
   There can be only one persistence cookie per HTTP backend, and it can be
   declared in a defaults section. The value of the cookie will be the value
   indicated after the "cookie" keyword in a "server" statement. If no cookie
@@ -3042,6 +3050,19 @@ dispatch :
   See also : "server"
 
 
+dynamic-cookie-key <key>
+  Set the dynamic cookie secret key for a backend. 
+  May be used in sections :   defaults | frontend | listen | backend
+ yes   |no|   yes  |   yes
+  Arguments : The secret key to be used.
+
+  When dynamic cookies are enabled (see the "dynamic" directive for cookie),
+  a dynamic cookie is created for each server (unless one is explicitly
+  specified on the "server" line), using a hash of the IP address of the
+  server, the TCP port, and the secret key.
+  That way, we can ensure session persistence across multiple load-balancers,
+  even if servers are dynamically added or removed.
+
 enabled
   Enable a proxy, frontend or backend.
   May be used in sections :   defaults | frontend | listen | backend
diff --git a/include/proto/server.h b/include/proto/server.h
index 7c9574e..6e3ccf3 100644
--- a/include/proto/server.h
+++ b/include/proto/server.h
@@ -204,6 +204,11 @@ void srv_set_admin_flag(struct server *s, enum srv_admin 
mode, const char *cause
  */
 void srv_clr_admin_flag(struct server *s, enum srv_admin mode);
 
+/* Calculates the dynamic persitent cookie for a server, if a secret key has
+ * been provided.
+ */
+void srv_set_dyncookie(struct server *s);
+
 /* Puts server <s> into maintenance mode, and 

Re: Dynamic cookies support

2017-03-15 Thread Olivier Houchard
On Wed, Mar 15, 2017 at 03:52:04PM +0200, Jarno Huuskonen wrote:
> Hi Olivier,
> 
> On Tue, Mar 14, Olivier Houchard wrote:
> > Hi guys,
> > 
> > You'll find attached patches to add support for dynamically-generated 
> > session
> > cookies for each server, the said cookies will be a hash of the IP, the
> > TCP port, and a secret key provided.
> > This adds 2 keywords to the config file, a "dynamic" keyword in the cookie
> > line, which just enables dynamic cookie, and a "dynamic-cookie-key" keyword,
> > for backends, which specifies the secret key to use.
> > This is a first step to be able to add and remove servers on the fly, 
> > without modifying the configuration. That way, we can have cookies that are
> > automatically usable for all the load-balancers.
> > 
> > Any comment would be welcome.
> 
> If I omit dynamic-cookie-key then I'll get a crash:
> (gdb) bt
> #0  0x76a72e71 in __strlen_sse2_pminub () from /lib64/libc.so.6
> #1  0x0044bbe4 in srv_set_dyncookie (s=s@entry=0x720050)
> at src/server.c:88
> (I think p->dyncookie_key is NULL).
> 
> #2  0x00429720 in check_config_validity () at src/cfgparse.c:8480
> #3  0x00487b25 in init (argc=, argc@entry=4, 
> argv=, argv@entry=0x7fffdcb8) at src/haproxy.c:814
> #4  0x0040820b in main (argc=4, argv=0x7fffdcb8)
> at src/haproxy.c:1643
> 
> with this config(defaults/global omitted):
> frontend test
>   bind ipv4@127.0.0.1:8080
>   default_backend test_be
> 
> backend test_be
>   balance roundrobin
>   #dynamic-cookie-key "dymmykey"
>   cookie WEBSRV insert dynamic
>   server wp1 127.0.0.1:8084 id 1
>   server wp2 127.0.0.1:8085 id 2
> 


Oops my bad, I'm an idiot.
Willy, can you commit the attached patch ?

> Is the hash same with little/big endian processors ?
> memcpy(&(tmpbuf[key_len]),
> s->addr.ss_family == AF_INET ?
> (void *)&((struct sockaddr_in *)&s->addr)->sin_addr.s_addr :
> (void *)&(((struct sockaddr_in6 *)&s->addr)->sin6_addr.s6_addr),
> addr_len);
> 

I expect those to be stored in network byte order, so it should be.
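
To illustrate why (a standalone sketch, not the actual haproxy code):
both sin_addr and sin6_addr hold the address in network byte order, so
copying their raw bytes into the hash input yields the same bytes on
little- and big-endian hosts:

#include <string.h>
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* copy the address bytes as they appear on the wire; returns the length */
static int put_addr(uint8_t *buf, const struct sockaddr_storage *ss)
{
	if (ss->ss_family == AF_INET) {
		memcpy(buf, &((const struct sockaddr_in *)ss)->sin_addr, 4);
		return 4;
	}
	memcpy(buf, &((const struct sockaddr_in6 *)ss)->sin6_addr, 16);
	return 16;
}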

> -Jarno
> 
> (Also there's a minor typo in comments:
> > +/* Calculates the dynamic persitent cookie for a server, if a secret key 
> > has
> > + * been provided.
> > + */
> ...
> > +   else if (!strcmp(args[cur_arg], "dynamic")) { /* 
> > Dynamic persitent cookies secret key */
> 
> s/persitent/persistent/)
> 

Oops, patch #2 :)

Thanks a lot !

Olivier
>From f9a92fd9a624962860af9c61208d60e304e60a52 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 15 Mar 2017 15:11:06 +0100
Subject: [PATCH 1/2] BUG/MEDIUM server: Fix crash when dynamic is defined, but
 no key is provided.
X-Bogosity: Ham, tests=bogofilter, spamicity=0.00, version=1.2.4

Wait until we're sure we have a key before trying to calculate its length.
---
 src/server.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/server.c b/src/server.c
index 4e03e50..5589723 100644
--- a/src/server.c
+++ b/src/server.c
@@ -85,7 +85,7 @@ void srv_set_dyncookie(struct server *s)
 	struct server *tmpserv;
 	char *tmpbuf;
 	unsigned long long hash_value;
-	size_t key_len = strlen(p->dyncookie_key);
+	size_t key_len;
 	size_t buffer_len;
 	int addr_len;
 	int port;
@@ -94,6 +94,7 @@ void srv_set_dyncookie(struct server *s)
 	    !(s->proxy->ck_opts & PR_CK_DYNAMIC) ||
 	    s->proxy->dyncookie_key == NULL)
 		return;
+	key_len = strlen(p->dyncookie_key);
 
if (s->addr.ss_family != AF_INET &&
s->addr.ss_family != AF_INET6)
-- 
2.9.3

>From c576bb09e643b34798056c83182752e872e578c1 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 15 Mar 2017 15:12:06 +0100
Subject: [PATCH 2/2] MINOR: Typo in comment.
X-Bogosity: Ham, tests=bogofilter, spamicity=0.00, version=1.2.4

---
 src/cfgparse.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/cfgparse.c b/src/cfgparse.c
index 5a09589..2eb25ed 100644
--- a/src/cfgparse.c
+++ b/src/cfgparse.c
@@ -3310,7 +3310,7 @@ int cfg_parse_listen(const char *file, int linenum, char **args, int kwm)
curproxy->cookie_maxlife = maxlife;
cur_arg++;
}
-			else if (!strcmp(args[cur_arg], "dynamic")) { /* Dynamic persitent cookies secret key */
+			else if (!strcmp(args[cur_arg], "dynamic")) { /* Dynamic persistent cookies secret key */
 
 				if (warnifnotcap(curproxy, PR_CAP_BE, file, linenum, args[cur_arg], NULL))
 					err_code |= ERR_WARN;
-- 
2.9.3



[PATCH] minor cleanup to the dynamic cookie code

2017-04-04 Thread Olivier Houchard
Hi guys,

Each time we generate a new dynamic cookie, we compare it with the other
servers' cookies, to make sure we didn't generate that one already, as
unlikely as that is. The existing code attempts to compare it against every
server of every proxy (and fails, because it bogusly checks against the
server list of the first proxy at each iteration, making it a total waste
of cycles), but we only have to worry about the current proxy, so that's
what the attached patch does.
Willy, I think it is mostly safe and you can apply it.

Cheers,

Olivier
>From 3fe44f06b4ba964a71582c4460400b489f7dcf72 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Tue, 4 Apr 2017 22:10:36 +0200
Subject: [PATCH] MINOR server: Restrict dynamic cookie check to the same
 proxy.

Each time we generate a dynamic cookie, we try to make sure the same
cookie hasn't been generated for another server, it's very unlikely, but
it may happen.
We only have to check that for the servers in the same proxy, no need to
check in the others; plus the code was buggy and would always check in the
first proxy of the proxy list.
---
 src/server.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/src/server.c b/src/server.c
index 23343d86..6070de93 100644
--- a/src/server.c
+++ b/src/server.c
@@ -135,18 +135,17 @@ void srv_set_dyncookie(struct server *s)
 * Check that we did not get a hash collision.
 * Unlikely, but it can happen.
 */
-	for (p = proxy; p != NULL; p = p->next)
-		for (tmpserv = proxy->srv; tmpserv != NULL;
-		    tmpserv = tmpserv->next) {
-			if (tmpserv == s)
-				continue;
-			if (tmpserv->cookie &&
-			    strcmp(tmpserv->cookie, s->cookie) == 0) {
-				Warning("We generated two equal cookies for two different servers.\n"
-					"Please change the secret key for '%s'.\n",
-					s->proxy->id);
-			}
+	for (tmpserv = p->srv; tmpserv != NULL;
+	    tmpserv = tmpserv->next) {
+		if (tmpserv == s)
+			continue;
+		if (tmpserv->cookie &&
+		    strcmp(tmpserv->cookie, s->cookie) == 0) {
+			Warning("We generated two equal cookies for two different servers.\n"
+				"Please change the secret key for '%s'.\n",
+				s->proxy->id);
 		}
+	}
 }
 
 /*
-- 
2.11.0



[RFC][PATCHES] seamless reload

2017-04-06 Thread Olivier Houchard
Hi,

The attached patchset is a first cut at an attempt to work around the
Linux issues with SO_REUSEPORT that make haproxy refuse a few new connections
under heavy load.
This works by transferring the existing sockets to the new process via the
stats socket. A new command-line flag has been added, -x, that takes the
path to the unix socket as an argument, and if set, will attempt to retrieve
all the listening sockets.
Right now, any error, either while connecting to the socket or retrieving
the file descriptors, is fatal, but maybe it'd be better to just fall back
to the previous behavior of opening any missing socket ? I'm still
undecided about that.
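
For testing, the idea is to start the new process with something along
these lines (a sketch; the socket path and pid file are just examples,
and the stats socket must be declared with "level admin" in the config):

haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid \
    -x /var/run/haproxy.sock -sf $(cat /run/haproxy.pid)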

Any testing, comments, etc would be greatly appreciated.

Regards,

Olivier
>From f2a13d1ce2f182170f70fe3d5312a538788f5877 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 5 Apr 2017 22:24:59 +0200
Subject: [PATCH 1/6] MINOR: cli: Add a command to send listening sockets.

Add a new command that will send all the listening sockets, via the
stats socket, and their properties.
This is a first step to workaround the linux problem when reloading
haproxy.
---
 include/types/connection.h |   8 ++
 src/cli.c  | 179 +
 2 files changed, 187 insertions(+)

diff --git a/include/types/connection.h b/include/types/connection.h
index 5ce5e0c..9d1b51a 100644
--- a/include/types/connection.h
+++ b/include/types/connection.h
@@ -389,6 +389,14 @@ struct tlv_ssl {
 #define PP2_CLIENT_CERT_CONN 0x02
 #define PP2_CLIENT_CERT_SESS 0x04
 
+
+/*
+ * Linux seems to be able to send 253 fds per sendmsg(), not sure
+ * about the other OSes.
+ */
+/* Max number of file descriptors we send in one sendmsg() */
+#define MAX_SEND_FD 253
+
 #endif /* _TYPES_CONNECTION_H */
 
 /*
diff --git a/src/cli.c b/src/cli.c
index fa45db9..37fb9fe 100644
--- a/src/cli.c
+++ b/src/cli.c
@@ -24,6 +24,8 @@
 #include 
 #include 
 
+#include 
+
 #include 
 #include 
 #include 
@@ -1013,6 +1015,182 @@ static int bind_parse_level(char **args, int cur_arg, struct proxy *px, struct bind_conf *conf, char **err)
return 0;
 }
 
+/* Send all the bound sockets, always returns 1 */
+static int _getsocks(char **args, struct appctx *appctx, void *private)
+{
+   char *cmsgbuf = NULL;
+   unsigned char *tmpbuf = NULL;
+   struct cmsghdr *cmsg;
+   struct stream_interface *si = appctx->owner;
+   struct connection *remote = objt_conn(si_opposite(si)->end);
+   struct msghdr msghdr;
+   struct iovec iov;
+   int *tmpfd;
+   int tot_fd_nb = 0;
+   struct proxy *px;
+   int i = 0;
+   int fd = remote->t.sock.fd;
+   int curoff = 0;
+   int old_fcntl;
+   int ret;
+
+   /* Temporarily set the FD in blocking mode; that will make our life easier */
+   old_fcntl = fcntl(fd, F_GETFL);
+   if (old_fcntl < 0) {
+   Warning("Couldn't get the flags for the unix socket\n");
+   goto out;
+   }
+   cmsgbuf = malloc(CMSG_SPACE(sizeof(int) * MAX_SEND_FD));
+   if (!cmsgbuf) {
+   Warning("Failed to allocate memory to send sockets\n");
+   goto out;
+   }
+   if (fcntl(fd, F_SETFL, old_fcntl &~ O_NONBLOCK) == -1) {
+   Warning("Cannot make the unix socket blocking\n");
+   goto out;
+   }
+   iov.iov_base = &tot_fd_nb;
+   iov.iov_len = sizeof(tot_fd_nb);
+   if (!cli_has_level(appctx, ACCESS_LVL_ADMIN))
+   goto out;
+   memset(&msghdr, 0, sizeof(msghdr));
+   /*
+* First, calculate the total number of FDs, so that we can let
+* the caller know how many to expect.
+*/
+   px = proxy;
+   while (px) {
+   struct listener *l;
+
+   list_for_each_entry(l, &px->conf.listeners, by_fe) {
+   /* Only transfer IPv4/IPv6 sockets */
+   if (l->proto->sock_family == AF_INET ||
+   l->proto->sock_family == AF_INET6)
+   tot_fd_nb++;
+   }
+   px = px->next;
+   }
+   if (tot_fd_nb == 0)
+   goto out;
+
+   /* First send the total number of file descriptors, so that the
+* receiving end knows what to expect.
+*/
+   msghdr.msg_iov = &iov;
+   msghdr.msg_iovlen = 1;
+   ret = sendmsg(fd, &msghdr, 0);
+   if (ret != sizeof(tot_fd_nb)) {
+   Warning("Failed to send the number of sockets to send\n");
+   goto out;
+   }
+
+   /* Now send the fds */
+   msghdr.msg_control = cmsgbuf;
+   msghdr.msg_controllen = CMSG_SPACE(sizeof(int) * MAX_SEND_FD);
+   cmsg = CMSG_FIRSTHDR(&msghdr);
+   cmsg->cmsg_len = CMSG_LEN(MAX_SEND_FD * sizeof(int));
+   cmsg->cmsg_level = SOL_SOCKET;
+  
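
For completeness, since only the sending side appears in this patch: the
receiving side that -x triggers is essentially the mirror of the above, a
recvmsg() loop peeling SCM_RIGHTS control messages. A minimal sketch with
illustrative names, not the actual haproxy code:

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define MAX_SEND_FD 253

/* Sketch only: receive up to <max> file descriptors from one recvmsg()
 * call, as a sender like _getsocks() above would pass them. Returns the
 * number of fds stored into <fds>, or -1 on error.
 */
static int recv_fds(int sock, int *fds, int max)
{
	char cmsgbuf[CMSG_SPACE(sizeof(int) * MAX_SEND_FD)];
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = sizeof(dummy) };
	struct msghdr msghdr = { .msg_iov = &iov, .msg_iovlen = 1,
	                         .msg_control = cmsgbuf,
	                         .msg_controllen = sizeof(cmsgbuf) };
	struct cmsghdr *cmsg;
	int nb = 0;

	if (recvmsg(sock, &msghdr, 0) <= 0)
		return -1;
	for (cmsg = CMSG_FIRSTHDR(&msghdr); cmsg != NULL;
	     cmsg = CMSG_NXTHDR(&msghdr, cmsg)) {
		if (cmsg->cmsg_level == SOL_SOCKET &&
		    cmsg->cmsg_type == SCM_RIGHTS) {
			/* number of fds carried by this control message */
			int n = (cmsg->cmsg_len - CMSG_LEN(0)) / sizeof(int);

			if (n > max - nb)
				n = max - nb;
			memcpy(fds + nb, CMSG_DATA(cmsg), n * sizeof(int));
			nb += n;
		}
	}
	return nb;
}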

Re: [RFC][PATCHES] seamless reload

2017-04-06 Thread Olivier Houchard
On Thu, Apr 06, 2017 at 04:56:47PM +0200, Pavlos Parissis wrote:
> On 06/04/2017 04:25 μμ, Olivier Houchard wrote:
> > Hi,
> > 
> > The attached patchset is the first cut at an attempt to work around the
> > linux issues with SOREUSEPORT that makes haproxy refuse a few new 
> > connections
> > under heavy load.
> > This works by transferring the existing sockets to the new process via the
> > stats socket. A new command-line flag has been added, -x, that takes the
> > path to the unix socket as an argument, and if set, will attempt to retrieve
> > all the listening sockets;
> > Right now, any error, either while connecting to the socket, or retrieving
> > the file descriptors, is fatal, but maybe it'd be better to just fall back
> > to the previous behavior instead of opening any missing socket ? I'm still
> > undecided about that.
> > 
> > Any testing, comments, etc would be greatly appreciated.
> > 
> 
> Does this patch set support HAProxy in multiprocess mode (nbproc > 1) ?
> 

Hi Pavlos,

If it does not, it's a bug :)
In my few tests, it seemed to work.

Olivier



Re: [RFC][PATCHES] seamless reload

2017-04-07 Thread Olivier Houchard
On Fri, Apr 07, 2017 at 09:58:57PM +0200, Pavlos Parissis wrote:
> On 06/04/2017 04:57 μμ, Olivier Houchard wrote:
> > On Thu, Apr 06, 2017 at 04:56:47PM +0200, Pavlos Parissis wrote:
> >> On 06/04/2017 04:25 , Olivier Houchard wrote:
> >>> Hi,
> >>>
> >>> The attached patchset is the first cut at an attempt to work around the
> >>> linux issues with SOREUSEPORT that makes haproxy refuse a few new 
> >>> connections
> >>> under heavy load.
> >>> This works by transferring the existing sockets to the new process via the
> >>> stats socket. A new command-line flag has been added, -x, that takes the
> >>> path to the unix socket as an argument, and if set, will attempt to 
> >>> retrieve
> >>> all the listening sockets;
> >>> Right now, any error, either while connecting to the socket, or retrieving
> >>> the file descriptors, is fatal, but maybe it'd be better to just fall back
> >>> to the previous behavior instead of opening any missing socket ? I'm still
> >>> undecided about that.
> >>>
> >>> Any testing, comments, etc would be greatly appreciated.
> >>>
> >>
> >> Does this patch set support HAProxy in multiprocess mode (nbproc > 1) ?
> >>
> > 
> > Hi Pavlos,
> > 
> > If it does not, it's a bug :)
> > In my few tests, it seemed to work.
> > 
> > Olivier
> > 
> 
> 
> I run systems with systemd and I can't see how I can test the seamless reload
> functionality ( thanks for that) without a proper support for the UNIX socket
> file argument (-x) in the haproxy-systemd-wrapper.
> 
> I believe you need to modify the wrapper to accept -x argument for a single
> UNIX socket file or -X for a directory path with multiple UNIX socket files,
> when HAProxy runs in multi-process mode.
> 
> What do you think?
> 
> Cheers,
> Pavlos
> 
> 
> 


Hi Pavlos,

I didn't consider systemd, so it may be I have to investigate there.
You don't need to talk to all the processes to get the sockets; in the new
world order, each process does have all the sockets, although it will accept
connections only for those for which it is configured to do so (I plan to add
a configuration option to restore the old behavior, for those who don't need 
that, and want to save file descriptors).
Reading the haproxy-systemd-wrapper code, it should be trivial.
I just need to figure out how to properly provide the socket path to the
 wrapper.
I see that you already made use of a few environment variables in
haproxy.service. Would it be reasonable to add a new one, that could
be set in haproxy.service.d/overwrite.conf ? I'm not super-used to systemd,
and I'm not sure how people are doing that kind of thing.

Thanks !

Olivier



Re: [RFC][PATCHES] seamless reload

2017-04-10 Thread Olivier Houchard

Hi,

On top of those patches, here a 3 more patches.
The first one makes the systemd wrapper check for a HAPROXY_STATS_SOCKET
environment variable. If set, it will use that as an argument to -x, when
reloading the process.
The second one sends listening unix sockets, as well as IPv4/v6 sockets. 
I see no reason not to, and that means we no longer have to wait until
the old process closes the socket before being able to accept new connections
on it.
The third one adds a new global option, nosockettransfer: if set, we assume
we will never try to transfer listening sockets through the stats socket,
and close any socket not bound to our process, to save a few file
descriptors.

Regards,

Olivier
>From 8d6c38b6824346b096ba31757ab62bc986a433b3 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Sun, 9 Apr 2017 16:28:10 +0200
Subject: [PATCH 7/9] MINOR: systemd wrapper: add support for passing the -x
 option.

Make the systemd wrapper check if HAPROXY_STATS_SOCKET is set.
If set, it will use it as an argument to the "-x" option, which makes
haproxy ask for any listening socket, on the stats socket, in order
to achieve reloads with no new connection lost.
---
 contrib/systemd/haproxy.service.in |  2 ++
 src/haproxy-systemd-wrapper.c  | 10 +-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/contrib/systemd/haproxy.service.in 
b/contrib/systemd/haproxy.service.in
index dca81a2..05bb716 100644
--- a/contrib/systemd/haproxy.service.in
+++ b/contrib/systemd/haproxy.service.in
@@ -3,6 +3,8 @@ Description=HAProxy Load Balancer
 After=network.target
 
 [Service]
+# You can point the environment variable HAPROXY_STATS_SOCKET to a stats
+# socket if you want seamless reloads.
 Environment="CONFIG=/etc/haproxy/haproxy.cfg" "PIDFILE=/run/haproxy.pid"
 ExecStartPre=@SBINDIR@/haproxy -f $CONFIG -c -q
 ExecStart=@SBINDIR@/haproxy-systemd-wrapper -f $CONFIG -p $PIDFILE
diff --git a/src/haproxy-systemd-wrapper.c b/src/haproxy-systemd-wrapper.c
index f6a9c85..1d00111 100644
--- a/src/haproxy-systemd-wrapper.c
+++ b/src/haproxy-systemd-wrapper.c
@@ -92,11 +92,15 @@ static void spawn_haproxy(char **pid_strv, int nb_pid)
pid = fork();
if (!pid) {
char **argv;
+   char *stats_socket = NULL;
int i;
int argno = 0;
 
/* 3 for "haproxy -Ds -sf" */
-   argv = calloc(4 + main_argc + nb_pid + 1, sizeof(char *));
+   if (nb_pid > 0)
+   stats_socket = getenv("HAPROXY_STATS_SOCKET");
+   argv = calloc(4 + main_argc + nb_pid + 1 +
+   stats_socket != NULL ? 2 : 0, sizeof(char *));
if (!argv) {
fprintf(stderr, SD_NOTICE "haproxy-systemd-wrapper: failed to calloc(), please try again later.\n");
exit(1);
@@ -121,6 +125,10 @@ static void spawn_haproxy(char **pid_strv, int nb_pid)
argv[argno++] = "-sf";
for (i = 0; i < nb_pid; ++i)
argv[argno++] = pid_strv[i];
+   if (stats_socket != NULL) {
+   argv[argno++] = "-x";
+   argv[argno++] = stats_socket;
+   }
}
argv[argno] = NULL;
 
-- 
2.9.3

>From df5e6e70f2e73fca9e28ba273904ab5c5acf53d3 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Sun, 9 Apr 2017 19:17:15 +0200
Subject: [PATCH 8/9] MINOR: cli: When sending listening sockets, send unix
 sockets too.

Send unix sockets, as well as IPv4/IPv6 sockets, so that we don't have to
wait for the old process to die before being able to bind those.
---
 src/cli.c|  6 --
 src/proto_uxst.c | 50 ++
 2 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/src/cli.c b/src/cli.c
index d5ff11f..533f792 100644
--- a/src/cli.c
+++ b/src/cli.c
@@ -1067,7 +1067,8 @@ static int _getsocks(char **args, struct appctx *appctx, 
void *private)
list_for_each_entry(l, &px->conf.listeners, by_fe) {
/* Only transfer IPv4/IPv6 sockets */
if (l->proto->sock_family == AF_INET ||
-   l->proto->sock_family == AF_INET6)
+   l->proto->sock_family == AF_INET6 ||
+   l->proto->sock_family == AF_UNIX)
tot_fd_nb++;
}
px = px->next;
@@ -1120,7 +1121,8 @@ static int _getsocks(char **args, struct appctx *appctx, 
void *private)
/* Only transfer IPv4/IPv6 sockets */
if (l->state >= LI_LISTEN &&
(l->proto->sock_family == AF_INET ||
-

Re: [RFC][PATCHES] seamless reload

2017-04-10 Thread Olivier Houchard
On Mon, Apr 10, 2017 at 10:49:21PM +0200, Pavlos Parissis wrote:
> On 07/04/2017 11:17 ????, Olivier Houchard wrote:
> > On Fri, Apr 07, 2017 at 09:58:57PM +0200, Pavlos Parissis wrote:
> >> On 06/04/2017 04:57 , Olivier Houchard wrote:
> >>> On Thu, Apr 06, 2017 at 04:56:47PM +0200, Pavlos Parissis wrote:
> >>>> On 06/04/2017 04:25 , Olivier Houchard wrote:
> >>>>> Hi,
> >>>>>
> >>>>> The attached patchset is the first cut at an attempt to work around the
> >>>>> linux issues with SOREUSEPORT that makes haproxy refuse a few new 
> >>>>> connections
> >>>>> under heavy load.
> >>>>> This works by transferring the existing sockets to the new process via 
> >>>>> the
> >>>>> stats socket. A new command-line flag has been added, -x, that takes the
> >>>>> path to the unix socket as an argument, and if set, will attempt to 
> >>>>> retrieve
> >>>>> all the listening sockets;
> >>>>> Right now, any error, either while connecting to the socket, or 
> >>>>> retrieving
> >>>>> the file descriptors, is fatal, but maybe it'd be better to just fall 
> >>>>> back
> >>>>> to the previous behavior instead of opening any missing socket ? I'm 
> >>>>> still
> >>>>> undecided about that.
> >>>>>
> >>>>> Any testing, comments, etc would be greatly appreciated.
> >>>>>
> >>>>
> >>>> Does this patch set support HAProxy in multiprocess mode (nbproc > 1) ?
> >>>>
> >>>
> >>> Hi Pavlos,
> >>>
> >>> If it does not, it's a bug :)
> >>> In my few tests, it seemed to work.
> >>>
> >>> Olivier
> >>>
> >>
> >>
> >> I run systems with systemd and I can't see how I can test the seamless 
> >> reload
> >> functionality ( thanks for that) without a proper support for the UNIX 
> >> socket
> >> file argument (-x) in the haproxy-systemd-wrapper.
> >>
> >> I believe you need to modify the wrapper to accept -x argument for a single
> >> UNIX socket file or -X for a directory path with multiple UNIX socket 
> >> files,
> >> when HAProxy runs in multi-process mode.
> >>
> >> What do you think?
> >>
> >> Cheers,
> >> Pavlos
> >>
> >>
> >>
> > 
> > 
> > Hi Pavlos,
> > 
> > I didn't consider systemd, so it may be I have to investigate there.
> > You don't need to talk to all the processes to get the sockets, in the new
> > world order, each process does have all the sockets, although it will accept
> > connections only for those for which it is configured to do so (I plan to 
> > add
> > a configuration option to restore the old behavior, for those who don't 
> > need 
> > that, and want to save file descriptors).
> > Reading the haproxy-systemd-wrapper code, it should be trivial.
> > I just need to figure out how to properly provide the socket path to the
> >  wrapper.
> > I see that you already made use of a few environment variables in
> > haproxy.service. Would that be reasonnable to add a new one, that could
> > be set in haproxy.service.d/overwrite.conf ? I'm not super-used to systemd,
> > and I'm not sure of how people are doing that kind of things.
> > 
> 
> I believe you are referring to $CONFIG and PIDFILE environment variables. 
> Those
> two variables are passed to the two arguments, which were present but 
> impossible
> to adjust their input, switching to variables allowed people to overwrite 
> their input.
> 
> In this case, we are talking about a new functionality I guess the best 
> approach
> would be to have ExecStart using EXTRAOPTS:
> ExecStart=/usr/sbin/haproxy-systemd-wrapper -f $CONFIG -p $PIDFILE $EXTRAOPTS
> 
> This will allow users to set a value to the new argument and any other 
> argument
> they want
> cat /etc/systemd/system/haproxy.service.d/overwrite.conf
> [Service]
> Environment="CONFIG=/etc/haproxy/haproxy.cfg" "PIDFILE=/run/haproxy.pid"
> "EXTRAOPTS=-x /foobar"
> 
> or using default configuration file /etc/default/haproxy
> 
> [Service]
> Environment="CONFIG=/etc/haproxy/haproxy.cfg" "PIDFILE=/run/haproxy.pid"
> EnvironmentFile=-/etc/default/haproxy
> ExecStart=/usr/sbin/haproxy-systemd-wrapper -f $CONFIG -p $PIDFILE $EXTRAOPTS
> 
> cat /etc/default/haproxy
> EXTRAOPTS="-x /foobar"
> 
> I hope it helps,
> Cheers,
> 



Hi Pavlos,

Yeah I see what you mean, it is certainly doable, though -x is a bit special,
because you don't use it the first time you run haproxy, only for reloading,
so that would mean the wrapper has special knowledge about it, and removes it
from the user-supplied command line the first time it's called. I'm a bit
uneasy about that, but if it's felt the best way to do it, I'll go ahead.

Regards,

Olivier



Re: [RFC][PATCHES] seamless reload

2017-04-10 Thread Olivier Houchard
On Mon, Apr 10, 2017 at 11:08:56PM +0200, Pavlos Parissis wrote:
> On 10/04/2017 08:09 μμ, Olivier Houchard wrote:
> > 
> > Hi,
> > 
> > On top of those patches, here a 3 more patches.
> > The first one makes the systemd wrapper check for a HAPROXY_STATS_SOCKET
> > environment variable. If set, it will use that as an argument to -x, when
> > reloading the process.
> 
> I see you want to introduce a specific environment variable for this 
> functionality
> and then fetched in the code with getenv(). This is a way to do it.
> 
> IMHO: I prefer to pass a value to an argument, for instance -x. It is also
> consistent with haproxy binary where someone uses -x argument as well.
> 

I'm not sure what you mean by this ?
It is supposed to be directly the value given to -x as an argument.

> > The second one sends listening unix sockets, as well as IPv4/v6 sockets. 
> > I see no reason not to, and that means we no longer have to wait until
> > the old process closes the socket before being able to accept new connections
> > on it.
> 
> > The third one adds a new global option, nosockettransfer: if set, we assume
> > we will never try to transfer listening sockets through the stats socket,
> > and close any socket not bound to our process, to save a few file
> > descriptors.
> > 
> 
> IMHO: a better name would be 'stats nounusedsockets', as it is referring to a
> generic functionality of the UNIX stats socket, rather than to a very specific
> functionality.
> 

Well really it is a global setting, maybe my wording isn't great, but it
makes haproxy close all sockets not bound to this process, as it used to,
instead of keeping them around in case somebody asks for them. 

> I hope tomorrow I will find some time to test your patches.
> 

Thanks a lot ! This is greatly appreciated.

Regards,

Olivier



Re: [RFC][PATCHES] seamless reload

2017-04-11 Thread Olivier Houchard
On Tue, Apr 11, 2017 at 01:23:42PM +0200, Pavlos Parissis wrote:
> On 10/04/2017 11:52 μμ, Olivier Houchard wrote:
> > On Mon, Apr 10, 2017 at 11:08:56PM +0200, Pavlos Parissis wrote:
> >> On 10/04/2017 08:09 , Olivier Houchard wrote:
> >>>
> >>> Hi,
> >>>
> >>> On top of those patches, here a 3 more patches.
> >>> The first one makes the systemd wrapper check for a HAPROXY_STATS_SOCKET
> >>> environment variable. If set, it will use that as an argument to -x, when
> >>> reloading the process.
> >>
> >> I see you want to introduce a specific environment variable for this 
> >> functionality
> >> and then fetched in the code with getenv(). This is a way to do it.
> >>
> >> IMHO: I prefer to pass a value to an argument, for instance -x. It is also
> >> consistent with haproxy binary where someone uses -x argument as well.
> >>
> > 
> > I'm not sure what you mean by this ?
> > It is supposed to be directly the value given to -x as an argument.
> > 
> 
> Ignore what I wrote above as I was under the wrong impression that we need to
> pass -x to systemd wrapper during the reload, but the systemd wrapper is not
> invoked during the reload.
> 
> 
> >>> The second one sends listening unix sockets, as well as IPv4/v6 sockets. 
> >>> I see no reason not to, and that means we no longer have to wait until
> >>> the old process closes the socket before being able to accept new 
> >>> connections
> >>> on it.
> >>
> >>> The third one adds a new global option, nosockettransfer: if set, we 
> >>> assume
> >>> we will never try to transfer listening sockets through the stats socket,
> >>> and close any socket not bound to our process, to save a few file
> >>> descriptors.
> >>>
> >>
> >> IMHO: a better name would be 'stats nounusedsockets', as it is referring to
> >> a generic functionality of the UNIX stats socket, rather than to a very
> >> specific functionality.
> >>
> > 
> > Well really it is a global setting, maybe my wording isn't great, but it
> > makes haproxy close all sockets not bound to this process, as it used to,
> > instead of keeping them around in case somebody asks for them. 
> > 
> >> I hope tomorrow I will find some time to test your patches.
> >>
> > 
> > Thanks a lot ! This is greatly appreciated.
> > 
> 
> Do you have a certain way to test it?
> My plan is to have a stress environment where I fire-up X thousands of TCP
> connections and produce a graph, which contain number of TCP RST received from
> the client during a soft-reload of haproxy. I'll run this test against haproxy
> 1.8 with and without your patches. So, I will have a clear indication if your
> patches solved the issue.
> 

That sounds good to me. My testing involved an haproxy running with 4
processes, 2 listeners, 1000 ports for each, and 2 processes bound to each
listener, 2 injectors doing http requests (that merely got a redirect as an
answer from haproxy), one on each listener, and reloading haproxy in a loop.

Regards,

Olivier



Re: [RFC][PATCHES] seamless reload

2017-04-11 Thread Olivier Houchard
On Tue, Apr 11, 2017 at 08:16:48PM +0200, Willy Tarreau wrote:
> Hi guys,
> 
> On Mon, Apr 10, 2017 at 11:08:56PM +0200, Pavlos Parissis wrote:
> > IMHO: a better name would be 'stats nounusedsockets', as it is referring to a
> > generic functionality of the UNIX stats socket, rather than to a very specific
> > functionality.
> 
> Just two things on this :
>   - it's not directly related to the stats socket but it's a global
> behaviour of the process so it should not be under "stats" ;
> 
>   - please guys, don't stick words together without a '-' as a delimiter
> anymore, we've made this mistake in the past with "forceclose",
> "httpclose" or "dontlognull" or whatever and some of them are
> really hard to read. That's why we now split the words using
> dashes, so that would be "no-unused-sockets" or I don't remember
> the other name Olivier proposed, but please use the same principle
> which leaves no ambiguity on how to parse it.

Fine, just because it's you, I'll change that.

Olivier



Re: [RFC][PATCHES] seamless reload

2017-04-12 Thread Olivier Houchard
On Wed, Apr 12, 2017 at 03:16:31PM +0200, Conrad Hoffmann wrote:
> Hi Olivier,
> 
> I was very eager to try out your patch set, thanks a lot! However, after
> applying all of them (including the last three), it seems that using
> environment variables in the config is broken (i.e. ${VARNAME} does not get
> replaced with the value of the environment variable anymore). I am using
> the systemd-wrapper, but not systemd.
> 
> Just from looking at the patches I couldn't make out what exactly might be
> the problem. I guess it could be either the wrapper not passing on its
> environment properly or haproxy not using it properly anymore. If you have
> any immediate ideas, let me know. I'll try to fully debug this when I can
> spare the time.
> 
> Thanks a lot,
> Conrad
> 

Hi Conrad,

You're right, that was just me being an idiot :)
Please replace 0007-MINOR-systemd-wrapper-add-support-for-passing-the-x-.patch
with the one attached, or just replace in src/haproxy-systemd-wrapper.c :

argv = calloc(4 + main_argc + nb_pid + 1 +
stats_socket != NULL ? 2 : 0, sizeof(char *));
by :
argv = calloc(4 + main_argc + nb_pid + 1 +
(stats_socket != NULL ? 2 : 0), sizeof(char *));
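
The underlying issue is pure C operator precedence: '+' and '!=' both bind
tighter than '?:', so the unparenthesized expression never computed an
element count at all. A sketch of how the two versions parse, with
illustrative names, not code from the patch:

/* Sketch (illustrative names): how the two expressions parse. */
static size_t argv_slots(int main_argc, int nb_pid, const char *stats_socket)
{
	/* Unparenthesized, the whole sum is folded into the condition:
	 *
	 *   4 + main_argc + nb_pid + 1 + stats_socket != NULL ? 2 : 0
	 *   == ((4 + main_argc + nb_pid + 1 + stats_socket) != NULL) ? 2 : 0
	 *
	 * so calloc() was asked for only 2 elements instead of the full
	 * argv size, and filling argv then overran the allocation.
	 * Parenthesized, the ternary only adds the 2 extra slots for "-x":
	 */
	return 4 + main_argc + nb_pid + 1 + (stats_socket != NULL ? 2 : 0);
}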

Regards,

Olivier
>From 526dca943b9cc89732c54bc43a6ce36e17b67890 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Sun, 9 Apr 2017 16:28:10 +0200
Subject: [PATCH 7/9] MINOR: systemd wrapper: add support for passing the -x
 option.

Make the systemd wrapper check if HAPROXY_STATS_SOCKET is set.
If set, it will use it as an argument to the "-x" option, which makes
haproxy ask for any listening socket, on the stats socket, in order
to achieve reloads with no new connection lost.
---
 contrib/systemd/haproxy.service.in |  2 ++
 src/haproxy-systemd-wrapper.c  | 10 +-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/contrib/systemd/haproxy.service.in 
b/contrib/systemd/haproxy.service.in
index dca81a2..05bb716 100644
--- a/contrib/systemd/haproxy.service.in
+++ b/contrib/systemd/haproxy.service.in
@@ -3,6 +3,8 @@ Description=HAProxy Load Balancer
 After=network.target
 
 [Service]
+# You can point the environment variable HAPROXY_STATS_SOCKET to a stats
+# socket if you want seamless reloads.
 Environment="CONFIG=/etc/haproxy/haproxy.cfg" "PIDFILE=/run/haproxy.pid"
 ExecStartPre=@SBINDIR@/haproxy -f $CONFIG -c -q
 ExecStart=@SBINDIR@/haproxy-systemd-wrapper -f $CONFIG -p $PIDFILE
diff --git a/src/haproxy-systemd-wrapper.c b/src/haproxy-systemd-wrapper.c
index f6a9c85..457f5bd 100644
--- a/src/haproxy-systemd-wrapper.c
+++ b/src/haproxy-systemd-wrapper.c
@@ -92,11 +92,15 @@ static void spawn_haproxy(char **pid_strv, int nb_pid)
pid = fork();
if (!pid) {
char **argv;
+   char *stats_socket = NULL;
int i;
int argno = 0;
 
/* 3 for "haproxy -Ds -sf" */
-   argv = calloc(4 + main_argc + nb_pid + 1, sizeof(char *));
+   if (nb_pid > 0)
+   stats_socket = getenv("HAPROXY_STATS_SOCKET");
+   argv = calloc(4 + main_argc + nb_pid + 1 +
+   (stats_socket != NULL ? 2 : 0), sizeof(char *));
if (!argv) {
fprintf(stderr, SD_NOTICE "haproxy-systemd-wrapper: failed to calloc(), please try again later.\n");
exit(1);
@@ -121,6 +125,10 @@ static void spawn_haproxy(char **pid_strv, int nb_pid)
argv[argno++] = "-sf";
for (i = 0; i < nb_pid; ++i)
argv[argno++] = pid_strv[i];
+   if (stats_socket != NULL) {
+   argv[argno++] = "-x";
+   argv[argno++] = stats_socket;
+   }
}
argv[argno] = NULL;
 
-- 
2.9.3



Re: [RFC][PATCHES] seamless reload

2017-04-12 Thread Olivier Houchard
On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
> Hi again,
> 
> so I tried to get this to work, but didn't manage yet. I also don't quite
> understand how this is supposed to work. The first haproxy process is
> started _without_ the -x option, is that correct? Where does that instance
> ever create the socket for transfer to later instances?
> 
> I have it working now insofar that on reload, subsequent instances are
> spawned with the -x option, but they'll just complain that they can't get
> anything from the unix socket (because, for all I can tell, it's not
> there?). I also can't see the relevant code path where this socket gets
> created, but I didn't have time to read all of it yet.
> 
> Am I doing something wrong? Did anyone get this to work with the
> systemd-wrapper so far?
> 

You're right, the first one runs without -x. The socket used is just any
stats socket.

> Also, but this might be a coincidence, my test setup takes a huge
> performance penalty just by applying your patches (without any reloading
> whatsoever). Did this happen to anybody else? I'll send some numbers and
> more details tomorrow.
> 

Ah, this is news to me, I did not expect a performance impact. Can you share
your configuration file ?

Thanks !

Olivier



Re: [RFC][PATCHES] seamless reload

2017-04-12 Thread Olivier Houchard
On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
> Hi again,
> 
> so I tried to get this to work, but didn't manage yet. I also don't quite
> understand how this is supposed to work. The first haproxy process is
> started _without_ the -x option, is that correct? Where does that instance
> ever create the socket for transfer to later instances?
> 
> I have it working now insofar that on reload, subsequent instances are
> spawned with the -x option, but they'll just complain that they can't get
> anything from the unix socket (because, for all I can tell, it's not
> there?). I also can't see the relevant code path where this socket gets
> created, but I didn't have time to read all of it yet.
> 
> Am I doing something wrong? Did anyone get this to work with the
> systemd-wrapper so far?
> 
> Also, but this might be a coincidence, my test setup takes a huge
> performance penalty just by applying your patches (without any reloading
> whatsoever). Did this happen to anybody else? I'll send some numbers and
> more details tomorrow.
> 

Ok I can confirm the performance issues, I'm investigating.

Olivier



Re: [RFC][PATCHES] seamless reload

2017-04-12 Thread Olivier Houchard
On Wed, Apr 12, 2017 at 05:50:54PM +0200, Olivier Houchard wrote:
> On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
> > Hi again,
> > 
> > so I tried to get this to work, but didn't manage yet. I also don't quite
> > understand how this is supposed to work. The first haproxy process is
> > started _without_ the -x option, is that correct? Where does that instance
> > ever create the socket for transfer to later instances?
> > 
> > I have it working now insofar that on reload, subsequent instances are
> > spawned with the -x option, but they'll just complain that they can't get
> > anything from the unix socket (because, for all I can tell, it's not
> > there?). I also can't see the relevant code path where this socket gets
> > created, but I didn't have time to read all of it yet.
> > 
> > Am I doing something wrong? Did anyone get this to work with the
> > systemd-wrapper so far?
> > 
> > Also, but this might be a coincidence, my test setup takes a huge
> > performance penalty just by applying your patches (without any reloading
> > whatsoever). Did this happen to anybody else? I'll send some numbers and
> > more details tomorrow.
> > 
> 
> Ok I can confirm the performance issues, I'm investigating.
> 

Found it, I was messing with SO_LINGER when I shouldn't have been.

Can you try the new version of
0003-MINOR-tcp-When-binding-socket-attempt-to-reuse-one-f.patch
or replace, in src/proto_tcp.c :
if (getsockopt(fd, SOL_SOCKET, SO_LINGER, &tmplinger, &len) == 0 &&
(tmplinger.l_onoff == 0 || tmplinger.l_linger == 0)) {
/* Attempt to activate SO_LINGER, not sure what a sane
 * value is, using the default BSD value of 120s.
 */
tmplinger.l_onoff = 1;
tmplinger.l_linger = 120;
setsockopt(fd, SOL_SOCKET, SO_LINGER, &tmplinger,
sizeof(tmplinger));
}


by :
if (getsockopt(fd, SOL_SOCKET, SO_LINGER, &tmplinger, &len) == 0 &&
(tmplinger.l_onoff == 1 || tmplinger.l_linger == 0)) {
tmplinger.l_onoff = 0;
tmplinger.l_linger = 0;
setsockopt(fd, SOL_SOCKET, SO_LINGER, &tmplinger,
sizeof(tmplinger));
}


Thanks !

Olivier
>From 1bf2b9c550285b434c865adeb175277ce00c842b Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 5 Apr 2017 22:39:56 +0200
Subject: [PATCH 3/9] MINOR: tcp: When binding socket, attempt to reuse one
 from the old proc.

Try to reuse any socket from the old process, provided by the "-x" flag,
before binding a new one, assuming it is compatible.
"Compatible" here means same address and port, same namspace if any,
same interface if any, and that the following flags are the same :
LI_O_FOREIGN, LI_O_V6ONLY and LI_O_V4V6.
Also change tcp_bind_listener() to always enable/disable socket options,
instead of just doing so if it is in the configuration file, as the option
may have been removed, ie TCP_FASTOPEN may have been set in the old process,
and removed from the new configuration, so we have to disable it.
---
 src/proto_tcp.c | 119 +++-
 1 file changed, 117 insertions(+), 2 deletions(-)

diff --git a/src/proto_tcp.c b/src/proto_tcp.c
index 5e12b99..ea6b8f7 100644
--- a/src/proto_tcp.c
+++ b/src/proto_tcp.c
@@ -731,6 +731,83 @@ int tcp_connect_probe(struct connection *conn)
return 0;
 }
 
+/* XXX: Should probably be elsewhere */
+static int compare_sockaddr(struct sockaddr_storage *a, struct 
sockaddr_storage *b)
+{
+   if (a->ss_family != b->ss_family) {
+   return (-1);
+   }
+   switch (a->ss_family) {
+   case AF_INET:
+   {
+   struct sockaddr_in *a4 = (void *)a, *b4 = (void *)b;
+   if (a4->sin_port != b4->sin_port)
+   return (-1);
+   return (memcmp(&a4->sin_addr, &b4->sin_addr,
+   sizeof(a4->sin_addr)));
+   }
+   case AF_INET6:
+   {
+   struct sockaddr_in6 *a6 = (void *)a, *b6 = (void *)b;
+   if (a6->sin6_port != b6->sin6_port)
+   return (-1);
+   return (memcmp(&a6->sin6_addr, &b6->sin6_addr,
+   sizeof(a6->sin6_addr)));
+   }
+   default:
+   return (-1);
+   }
+
+}
+
+#define LI_MANDATORY_FLAGS (LI_O_FOREIGN | LI_O_V6ONLY
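
A side note on the SO_LINGER semantics involved above, summarized for the
archive: with l_onoff=1 and l_linger=0, close() discards any unsent data and
aborts the connection with an RST instead of a normal FIN shutdown. That is
the "abortive close" state the corrected code clears on inherited sockets,
instead of (as the first version mistakenly did) turning lingering on. A
minimal sketch of the setting being guarded against:

#include <sys/socket.h>
#include <unistd.h>

/* Sketch: an abortive-linger close. After the setsockopt(), close()
 * discards pending data and sends an RST to the peer instead of a
 * graceful FIN shutdown.
 */
static void abortive_close(int fd)
{
	struct linger abort_on_close = { .l_onoff = 1, .l_linger = 0 };

	setsockopt(fd, SOL_SOCKET, SO_LINGER, &abort_on_close,
	           sizeof(abort_on_close));
	close(fd);  /* -> RST, not FIN */
}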

Re: [RFC][PATCHES] seamless reload

2017-04-12 Thread Olivier Houchard
Yet another patch, on top of the previous ones.
This one tries to get the default value of TCP_MAXSEG by creating a temporary
TCP socket, so that one can remove the "mss" entry from the configuration file
and have the MSS reset to its default for any socket transferred from the old
process.

Olivier
>From 7dc2432f3a7c4a9e9531adafa4524a199e394f90 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 12 Apr 2017 19:32:15 +0200
Subject: [PATCH 10/10] MINOR: tcp: Attempt to reset TCP_MAXSEG when reusing a
 socket.

Guess the default value for TCP_MAXSEG by binding a temporary TCP socket and
getting it, and use that in case we're reusing a socket from the old
process, and its TCP_MAXSEG is different. That way one can reset the
TCP_MAXSEG value to the default one when reusing sockets.
---
 src/proto_tcp.c | 55 ---
 1 file changed, 52 insertions(+), 3 deletions(-)

diff --git a/src/proto_tcp.c b/src/proto_tcp.c
index ea6b8f7..8050f3e 100644
--- a/src/proto_tcp.c
+++ b/src/proto_tcp.c
@@ -110,6 +110,12 @@ static struct protocol proto_tcpv6 = {
.nb_listeners = 0,
 };
 
+/* Default TCP parameters, got by opening a temporary TCP socket. */
+#ifdef TCP_MAXSEG
+static int default_tcp_maxseg = -1;
+static int default_tcp6_maxseg = -1;
+#endif
+
 /* Binds ipv4/ipv6 address <local> to socket <fd>, unless <remote> is set, in
  * which case we try to bind <remote>. <flags> is a 2-bit field consisting of :
  *  - 0 : ignore remote address (may even be a NULL pointer)
@@ -829,6 +835,35 @@ int tcp_bind_listener(struct listener *listener, char 
*errmsg, int errlen)
int ext, ready;
socklen_t ready_len;
const char *msg = NULL;
+#ifdef TCP_MAXSEG
+
+   /* Create a temporary TCP socket to get default parameters we can't
+* guess.
+* */
+   ready_len = sizeof(default_tcp_maxseg);
+   if (default_tcp_maxseg == -1) {
+   default_tcp_maxseg = -2;
+   fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
+   if (fd < 0)
+   Warning("Failed to create a temporary socket!\n");
+   else {
+   if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, 
&default_tcp_maxseg,
+   &ready_len) == -1)
+   Warning("Failed to get the default value of 
TCP_MAXSEG\n");
+   }
+   }
+   if (default_tcp6_maxseg == -1) {
+   default_tcp6_maxseg = -2;
+   fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP);
+   if (fd >= 0) {
+   if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, 
&default_tcp6_maxseg,
+   &ready_len) == -1)
+   Warning("Failed ot get the default value of 
TCP_MAXSEG for IPv6\n");
+   close(fd);
+   }
+   }
+#endif
+
 
/* ensure we never return garbage */
if (errlen)
@@ -960,10 +995,24 @@ int tcp_bind_listener(struct listener *listener, char 
*errmsg, int errlen)
msg = "cannot set MSS";
err |= ERR_WARN;
}
+   } else if (ext) {
+   int tmpmaxseg = -1;
+   int defaultmss;
+   socklen_t len = sizeof(tmpmaxseg);
+
+   if (listener->addr.ss_family == AF_INET)
+   defaultmss = default_tcp_maxseg;
+   else
+   defaultmss = default_tcp6_maxseg;
+
+   getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &tmpmaxseg, &len);
+   if (tmpmaxseg != defaultmss &&
+   setsockopt(fd, IPPROTO_TCP, TCP_MAXSEG,
+   &defaultmss, sizeof(defaultmss)) == -1) {
+   msg = "cannot set MSS";
+   err |= ERR_WARN;
+   }
}
-   /* XXX: No else, the way to get the default MSS will vary from system
-* to system.
-*/
 #endif
 #if defined(TCP_USER_TIMEOUT)
if (listener->tcp_ut) {
-- 
2.9.3



Re: [RFC][PATCHES] seamless reload

2017-04-12 Thread Olivier Houchard
On Wed, Apr 12, 2017 at 11:19:37AM -0700, Steven Davidovitz wrote:
> I had a problem testing it on Mac OS X, because cmsghdr is aligned to 4
> bytes. I changed the CMSG_ALIGN(sizeof(struct cmsghdr)) call to CMSG_LEN(0)
> to fix it.
> 

Oh right, I'll change that.
Thanks a lot !

Olivier
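
A portability note for the archive: CMSG_ALIGN() is not specified by POSIX
and may not exist, or may not match the platform's own cmsghdr alignment
(on Darwin the header is only 4-byte aligned), whereas CMSG_LEN() is
standard and CMSG_LEN(0) is by definition the aligned size of the header
alone. A sketch of the equivalence, assuming only the standard macros:

#include <sys/socket.h>

/* Sketch: locating the fd payload in a control message portably.
 * CMSG_LEN(0) is the aligned size of struct cmsghdr itself, so this
 * always agrees with what CMSG_DATA() returns.
 */
static int *cmsg_fd_payload(struct cmsghdr *cmsg)
{
	return (int *)((char *)cmsg + CMSG_LEN(0));
	/* same address as: (int *)CMSG_DATA(cmsg) */
}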



Re: [RFC][PATCHES] seamless reload

2017-04-13 Thread Olivier Houchard
On Thu, Apr 13, 2017 at 11:17:45AM +0200, Conrad Hoffmann wrote:
> Hi Olivier,
> 
> On 04/12/2017 06:09 PM, Olivier Houchard wrote:
> > On Wed, Apr 12, 2017 at 05:50:54PM +0200, Olivier Houchard wrote:
> >> On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
> >>> Hi again,
> >>>
> >>> so I tried to get this to work, but didn't manage yet. I also don't quite
> >>> understand how this is supposed to work. The first haproxy process is
> >>> started _without_ the -x option, is that correct? Where does that instance
> >>> ever create the socket for transfer to later instances?
> >>>
> >>> I have it working now insofar that on reload, subsequent instances are
> >>> spawned with the -x option, but they'll just complain that they can't get
> >>> anything from the unix socket (because, for all I can tell, it's not
> >>> there?). I also can't see the relevant code path where this socket gets
> >>> created, but I didn't have time to read all of it yet.
> >>>
> >>> Am I doing something wrong? Did anyone get this to work with the
> >>> systemd-wrapper so far?
> >>>
> >>> Also, but this might be a coincidence, my test setup takes a huge
> >>> performance penalty just by applying your patches (without any reloading
> >>> whatsoever). Did this happen to anybody else? I'll send some numbers and
> >>> more details tomorrow.
> >>>
> >>
> >> Ok I can confirm the performance issues, I'm investigating.
> >>
> > 
> > Found it, I was messing with SO_LINGER when I shouldn't have been.
> 
> 
> 
> thanks a lot, I can confirm that the performance regression seems to be gone!
> 
> I am still having the other (conceptual) problem, though. Sorry if this is
> just me holding it wrong or something, it's been a while since I dug
> through the internals of haproxy.
> 
> So, as I mentioned before, we use nbproc (12) and the systemd-wrapper,
> which in turn starts haproxy in daemon mode, giving us a process tree like
> this (path and file names shortened for brevity):
> 
> \_ /u/s/haproxy-systemd-wrapper -f ./hap.cfg -p /v/r/hap.pid
> \_ /u/s/haproxy-master
> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> 
> Now, in our config file, we have something like this:
> 
> # expose admin socket for each process
>   stats socket ${STATS_ADDR}   level admin process 1
>   stats socket ${STATS_ADDR}-2 level admin process 2
>   stats socket ${STATS_ADDR}-3 level admin process 3
>   stats socket ${STATS_ADDR}-4 level admin process 4
>   stats socket ${STATS_ADDR}-5 level admin process 5
>   stats socket ${STATS_ADDR}-6 level admin process 6
>   stats socket ${STATS_ADDR}-7 level admin process 7
>   stats socket ${STATS_ADDR}-8 level admin process 8
>   stats socket ${STATS_ADDR}-9 level admin process 9
>   stats socket ${STATS_ADDR}-10 level admin process 10
>   stats socket ${STATS_ADDR}-11 level admin process 11
>   stats socket ${STATS_ADDR}-12 level admin process 12
> 
> Basically, we have a dedicated admin socket for each ("real") process, as we
> need to be able to talk to each process individually. So I was wondering:
> which admin socket should I pass as HAPROXY_STATS_SOCKET? I initially
> thought it would have to be a special stats socket in the haproxy-master
> process (which we currently don't have), but as far as I can tell from the
> output of `lsof` the haproxy-master process doesn't even hold any FDs
> anymore. Will this setup currently work with your patches at all? Do I need
> to add a stats socket to the master process? Or would this require a list
> of stats sockets to be passed, similar to the list of PIDs that gets passed
> to new haproxy instances, so that each process can talk to the one from
> which it is taking over the socket(s)? In case I need a stats socket for
> the master process, what would be the directive to create it?
> 

Hi Conrad,

Any of those sockets will do. Each process is made to keep all the
listening sockets open, even if the proxy is not bound to that specific
process, precisely so that they can be transferred via the unix socket.

Regards,

Olivier



Re: [RFC][PATCHES] seamless reload

2017-04-13 Thread Olivier Houchard
On Thu, Apr 13, 2017 at 12:59:38PM +0200, Conrad Hoffmann wrote:
> On 04/13/2017 11:31 AM, Olivier Houchard wrote:
> > On Thu, Apr 13, 2017 at 11:17:45AM +0200, Conrad Hoffmann wrote:
> >> Hi Olivier,
> >>
> >> On 04/12/2017 06:09 PM, Olivier Houchard wrote:
> >>> On Wed, Apr 12, 2017 at 05:50:54PM +0200, Olivier Houchard wrote:
> >>>> On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
> >>>>> Hi again,
> >>>>>
> >>>>> so I tried to get this to work, but didn't manage yet. I also don't 
> >>>>> quite
> >>>>> understand how this is supposed to work. The first haproxy process is
> >>>>> started _without_ the -x option, is that correct? Where does that 
> >>>>> instance
> >>>>> ever create the socket for transfer to later instances?
> >>>>>
> >>>>> I have it working now insofar that on reload, subsequent instances are
> >>>>> spawned with the -x option, but they'll just complain that they can't 
> >>>>> get
> >>>>> anything from the unix socket (because, for all I can tell, it's not
> >>>>> there?). I also can't see the relevant code path where this socket gets
> >>>>> created, but I didn't have time to read all of it yet.
> >>>>>
> >>>>> Am I doing something wrong? Did anyone get this to work with the
> >>>>> systemd-wrapper so far?
> >>>>>
> >>>>> Also, but this might be a coincidence, my test setup takes a huge
> >>>>> performance penalty just by applying your patches (without any reloading
> >>>>> whatsoever). Did this happen to anybody else? I'll send some numbers and
> >>>>> more details tomorrow.
> >>>>>
> >>>>
> >>>> Ok I can confirm the performance issues, I'm investigating.
> >>>>
> >>>
> >>> Found it, I was messing with SO_LINGER when I shouldn't have been.
> >>
> >> 
> >>
> >> thanks a lot, I can confirm that the performance regression seems to be 
> >> gone!
> >>
> >> I am still having the other (conceptual) problem, though. Sorry if this is
> >> just me holding it wrong or something, it's been a while since I dug
> >> through the internals of haproxy.
> >>
> >> So, as I mentioned before, we use nbproc (12) and the systemd-wrapper,
> >> which in turn starts haproxy in daemon mode, giving us a process tree like
> >> this (path and file names shortened for brevity):
> >>
> >> \_ /u/s/haproxy-systemd-wrapper -f ./hap.cfg -p /v/r/hap.pid
> >> \_ /u/s/haproxy-master
> >> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>
> >> Now, in our config file, we have something like this:
> >>
> >> # expose admin socket for each process
> >>   stats socket ${STATS_ADDR}   level admin process 1
> >>   stats socket ${STATS_ADDR}-2 level admin process 2
> >>   stats socket ${STATS_ADDR}-3 level admin process 3
> >>   stats socket ${STATS_ADDR}-4 level admin process 4
> >>   stats socket ${STATS_ADDR}-5 level admin process 5
> >>   stats socket ${STATS_ADDR}-6 level admin process 6
> >>   stats socket ${STATS_ADDR}-7 level admin process 7
> >>   stats socket ${STATS_ADDR}-8 level admin process 8
> >>   stats socket ${STATS_ADDR}-9 level admin process 9
> >>   stats socket ${STATS_ADDR}-10 level admin process 10
> >>   stats socket ${STATS_ADDR}-11 level admin process 11
> >>   stats socket ${STATS_ADDR}-12 level admin process 12
> >>
>> Basically, we have a dedicated admin socket for each ("real&

Re: [RFC][PATCHES] seamless reload

2017-04-13 Thread Olivier Houchard
On Thu, Apr 13, 2017 at 03:06:47PM +0200, Conrad Hoffmann wrote:
> 
> 
> On 04/13/2017 02:28 PM, Olivier Houchard wrote:
> > On Thu, Apr 13, 2017 at 12:59:38PM +0200, Conrad Hoffmann wrote:
> >> On 04/13/2017 11:31 AM, Olivier Houchard wrote:
> >>> On Thu, Apr 13, 2017 at 11:17:45AM +0200, Conrad Hoffmann wrote:
> >>>> Hi Olivier,
> >>>>
> >>>> On 04/12/2017 06:09 PM, Olivier Houchard wrote:
> >>>>> On Wed, Apr 12, 2017 at 05:50:54PM +0200, Olivier Houchard wrote:
> >>>>>> On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
> >>>>>>> Hi again,
> >>>>>>>
> >>>>>>> so I tried to get this to work, but didn't manage yet. I also don't 
> >>>>>>> quite
> >>>>>>> understand how this is supposed to work. The first haproxy process is
> >>>>>>> started _without_ the -x option, is that correct? Where does that 
> >>>>>>> instance
> >>>>>>> ever create the socket for transfer to later instances?
> >>>>>>>
> >>>>>>> I have it working now insofar that on reload, subsequent instances are
> >>>>>>> spawned with the -x option, but they'll just complain that they can't 
> >>>>>>> get
> >>>>>>> anything from the unix socket (because, for all I can tell, it's not
> >>>>>>> there?). I also can't see the relevant code path where this socket 
> >>>>>>> gets
> >>>>>>> created, but I didn't have time to read all of it yet.
> >>>>>>>
> >>>>>>> Am I doing something wrong? Did anyone get this to work with the
> >>>>>>> systemd-wrapper so far?
> >>>>>>>
> >>>>>>> Also, but this might be a coincidence, my test setup takes a huge
> >>>>>>> performance penalty just by applying your patches (without any 
> >>>>>>> reloading
> >>>>>>> whatsoever). Did this happen to anybody else? I'll send some numbers 
> >>>>>>> and
> >>>>>>> more details tomorrow.
> >>>>>>>
> >>>>>>
> >>>>>> Ok I can confirm the performance issues, I'm investigating.
> >>>>>>
> >>>>>
> >>>>> Found it, I was messing with SO_LINGER when I shouldn't have been.
> >>>>
> >>>> 
> >>>>
> >>>> thanks a lot, I can confirm that the performance regression seems to be 
> >>>> gone!
> >>>>
> >>>> I am still having the other (conceptual) problem, though. Sorry if this 
> >>>> is
> >>>> just me holding it wrong or something, it's been a while since I dug
> >>>> through the internals of haproxy.
> >>>>
> >>>> So, as I mentioned before, we use nbproc (12) and the systemd-wrapper,
> >>>> which in turn starts haproxy in daemon mode, giving us a process tree 
> >>>> like
> >>>> this (path and file names shortened for brevity):
> >>>>
> >>>> \_ /u/s/haproxy-systemd-wrapper -f ./hap.cfg -p /v/r/hap.pid
> >>>> \_ /u/s/haproxy-master
> >>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
> >>>>
> >>>> Now, in our config file, we have something like this:
> >>>>
> >>>> # expose admin socket for each process
> >>>>   stats socket ${STATS_ADDR}   level 

Re: [RFC][PATCHES] seamless reload

2017-04-13 Thread Olivier Houchard
On Thu, Apr 13, 2017 at 04:59:26PM +0200, Conrad Hoffmann wrote:
> Sure, here it is ;P
> 
> I now get a segfault (on reload):
> 
> *** Error in `/usr/sbin/haproxy': corrupted double-linked list:
> 0x05b511e0 ***
> 
> Here is the backtrace, retrieved from the core file:
> 
> (gdb) bt
> #0  0x7f4c92801067 in __GI_raise (sig=sig@entry=6) at
> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1  0x7f4c92802448 in __GI_abort () at abort.c:89
> #2  0x7f4c9283f1b4 in __libc_message (do_abort=do_abort@entry=1,
> fmt=fmt@entry=0x7f4c92934210 "*** Error in `%s': %s: 0x%s ***\n") at
> ../sysdeps/posix/libc_fatal.c:175
> #3  0x7f4c9284498e in malloc_printerr (action=1, str=0x7f4c929302ec
> "corrupted double-linked list", ptr=) at malloc.c:4996
> #4  0x7f4c92845923 in _int_free (av=0x7f4c92b71620 ,
> p=, have_lock=0) at malloc.c:3996
> #5  0x00485850 in tcp_find_compatible_fd (l=0xaaed20) at
> src/proto_tcp.c:812
> #6  tcp_bind_listener (listener=0xaaed20, errmsg=0x7ffccc774e10 "",
> errlen=100) at src/proto_tcp.c:878
> #7  0x00493ce1 in start_proxies (verbose=0) at src/proxy.c:793
> #8  0x004091ec in main (argc=21, argv=0x7ffccc775168) at
> src/haproxy.c:1942

Ok, yet another stupid mistake, hopefully the attached patch fixes this :)

Thanks !

Olivier
>From 7c7fe0c00129d60617cba786cbec7bbdd9ce08f8 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Thu, 13 Apr 2017 17:06:53 +0200
Subject: [PATCH 12/12] BUG/MINOR: Properly remove the xfer_sock from the
 linked list.

Doubly linked lists are hard to get right.
---
 src/proto_tcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/proto_tcp.c b/src/proto_tcp.c
index f558f00..57d6fc1 100644
--- a/src/proto_tcp.c
+++ b/src/proto_tcp.c
@@ -806,7 +806,7 @@ static int tcp_find_compatible_fd(struct listener *l)
if (xfer_sock->prev)
xfer_sock->prev->next = xfer_sock->next;
if (xfer_sock->next)
-   xfer_sock->next->prev = xfer_sock->next->prev;
+   xfer_sock->next->prev = xfer_sock->prev;
free(xfer_sock->iface);
free(xfer_sock->namespace);
free(xfer_sock);
-- 
2.9.3
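
The one-character nature of the fix is easy to miss. The generic unlink
idiom it restores, as a sketch with illustrative names:

struct xfer_node {
	struct xfer_node *prev, *next;
};

/* Sketch: unlinking <node> from a NULL-terminated doubly linked list.
 * Each neighbor must be pointed across the removed node.
 */
static void unlink_node(struct xfer_node *node)
{
	if (node->prev)
		node->prev->next = node->next;
	if (node->next)
		node->next->prev = node->prev; /* the bug assigned next->prev to itself */
}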



Re: [RFC][PATCHES] seamless reload

2017-04-13 Thread Olivier Houchard
On Thu, Apr 13, 2017 at 06:00:59PM +0200, Conrad Hoffmann wrote:
> On 04/13/2017 05:10 PM, Olivier Houchard wrote:
> > On Thu, Apr 13, 2017 at 04:59:26PM +0200, Conrad Hoffmann wrote:
> >> Sure, here it is ;P
> >>
> >> I now get a segfault (on reload):
> >>
> >> *** Error in `/usr/sbin/haproxy': corrupted double-linked list:
> >> 0x05b511e0 ***
> >>
> >> Here is the backtrace, retrieved from the core file:
> >>
> >> (gdb) bt
> >> #0  0x7f4c92801067 in __GI_raise (sig=sig@entry=6) at
> >> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> >> #1  0x7f4c92802448 in __GI_abort () at abort.c:89
> >> #2  0x7f4c9283f1b4 in __libc_message (do_abort=do_abort@entry=1,
> >> fmt=fmt@entry=0x7f4c92934210 "*** Error in `%s': %s: 0x%s ***\n") at
> >> ../sysdeps/posix/libc_fatal.c:175
> >> #3  0x7f4c9284498e in malloc_printerr (action=1, str=0x7f4c929302ec
> >> "corrupted double-linked list", ptr=) at malloc.c:4996
> >> #4  0x7f4c92845923 in _int_free (av=0x7f4c92b71620 ,
> >> p=, have_lock=0) at malloc.c:3996
> >> #5  0x00485850 in tcp_find_compatible_fd (l=0xaaed20) at
> >> src/proto_tcp.c:812
> >> #6  tcp_bind_listener (listener=0xaaed20, errmsg=0x7ffccc774e10 "",
> >> errlen=100) at src/proto_tcp.c:878
> >> #7  0x00493ce1 in start_proxies (verbose=0) at src/proxy.c:793
> >> #8  0x004091ec in main (argc=21, argv=0x7ffccc775168) at
> >> src/haproxy.c:1942
> > 
> > Ok, yet another stupid mistake, hopefully the attached patch fixes this :)
> > 
> > Thanks !
> > 
> > Olivier
> 
> 
> It does indeed! Not only does it work now, the result is impressive! Not a
> single dropped request even when aggressively reloading under substantial 
> load!
> 
> You certainly gave me an unexpected early easter present here :)
> 
> I will now head out, but I am planning on installing a 1.8 version with
> your patches on our canary pool (which receives a small amount of
> production traffic to test changes) after the holidays. I will happily test
> anything else that might be helpful for you. I will also set up a proper
> load test inside our data center then, but I expect no surprises there (my
> current tests were done over a VPN link, somewhat limiting the achievable
> throughput).
> 
> Once more, thank you so much! This will greatly simplify much of our
> operations. If there is anything else we can help test, let me know :)

Phew, at last :) Thanks a lot for your patience, and for doing all that testing !

Olivier



Re: [RFC][PATCHES] seamless reload

2017-04-19 Thread Olivier Houchard
On Wed, Apr 19, 2017 at 09:58:27AM +0200, Pavlos Parissis wrote:
> On 13/04/2017 06:18 μμ, Olivier Houchard wrote:
> > On Thu, Apr 13, 2017 at 06:00:59PM +0200, Conrad Hoffmann wrote:
> >> On 04/13/2017 05:10 PM, Olivier Houchard wrote:
> >>> On Thu, Apr 13, 2017 at 04:59:26PM +0200, Conrad Hoffmann wrote:
> >>>> Sure, here it is ;P
> >>>>
> >>>> I now get a segfault (on reload):
> >>>>
> >>>> *** Error in `/usr/sbin/haproxy': corrupted double-linked list:
> >>>> 0x05b511e0 ***
> >>>>
> >>>> Here is the backtrace, retrieved from the core file:
> >>>>
> >>>> (gdb) bt
> >>>> #0  0x7f4c92801067 in __GI_raise (sig=sig@entry=6) at
> >>>> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> >>>> #1  0x7f4c92802448 in __GI_abort () at abort.c:89
> >>>> #2  0x7f4c9283f1b4 in __libc_message (do_abort=do_abort@entry=1,
> >>>> fmt=fmt@entry=0x7f4c92934210 "*** Error in `%s': %s: 0x%s ***\n") at
> >>>> ../sysdeps/posix/libc_fatal.c:175
> >>>> #3  0x7f4c9284498e in malloc_printerr (action=1, str=0x7f4c929302ec
> >>>> "corrupted double-linked list", ptr=) at malloc.c:4996
> >>>> #4  0x7f4c92845923 in _int_free (av=0x7f4c92b71620 ,
> >>>> p=, have_lock=0) at malloc.c:3996
> >>>> #5  0x00485850 in tcp_find_compatible_fd (l=0xaaed20) at
> >>>> src/proto_tcp.c:812
> >>>> #6  tcp_bind_listener (listener=0xaaed20, errmsg=0x7ffccc774e10 "",
> >>>> errlen=100) at src/proto_tcp.c:878
> >>>> #7  0x00493ce1 in start_proxies (verbose=0) at src/proxy.c:793
> >>>> #8  0x004091ec in main (argc=21, argv=0x7ffccc775168) at
> >>>> src/haproxy.c:1942
> >>>
> >>> Ok, yet another stupid mistake, hopefully the attached patch fixes this :)
> >>>
> >>> Thanks !
> >>>
> >>> Olivier
> >>
> >>
> >> It does indeed! Not only does it work now, the result is impressive! Not a
> >> single dropped request even when aggressively reloading under substantial 
> >> load!
> >>
> >> You certainly gave me an unexpected early easter present here :)
> >>
> >> I will now head out, but I am planning on installing a 1.8 version with
> >> your patches on our canary pool (which receives a small amount of
> >> production traffic to test changes) after the holidays. I will happily test
> >> anything else that might be helpful for you. I will also set up a proper
> >> load test inside our data center then, but I expect no surprises there (my
> >> current tests were done over a VPN link, somewhat limiting the achievable
> >> throughput).
> >>
> >> Once more, thank you so much! This will greatly simplify much of our
> >> operations. If there is anything else we can help test, let me know :)
> > 
> > Phew, at last :) Thanks a lot for your patience, and for doing all that
> > testing !
> > 
> > Olivier
> > 
> 
> 
> Joining this again a bit late, do you still want me to test it?
> I would like to know if there is any performance impact, but I see that
> Conrad Hoffmann has done all the hard work on this. So, we can conclude
> that we don't expect any performance impact.
> 
> Once again thanks a lot for your hard work on this.
> @Conrad Hoffmann, thanks a lot for testing this and checking if there is any
> performance impact.
> 


Hi Pavlos,

More testing is always appreciated :)
I don't expect any performance impact, but that wouldn't be the first time
I say that and am proven wrong, though as you said with all the testing
Conrad did, I'm fairly confident it should be OK.

Thanks !

Olivier





[PATCH] Fix haproxy hangs on FreeBSD >= 11

2017-04-19 Thread Olivier Houchard
Hi guys,

Thanks to your help, we finally figured out what was happening on FreeBSD,
and the attached patch should fix it.
Problem was, haproxy relies on what is really undefined behavior in C, with
signed integer overflows. gcc and earlier versions of clang behaved as we
expected, but newer versions of clang do not, which is why it started to be
a problem with FreeBSD 11 (and probably on other BSDs using clang, and
MacOS X).

Thanks again for giving us enough information to really figure that out, it
would have been _tough_ otherwise :)

Regards,

Olivier
>From 163be439a8bc6e5aa1cf3fea0f086d518ddad0a9 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 19 Apr 2017 11:34:10 +0200
Subject: [PATCH] BUG/MAJOR: Use -fwrapv.

Haproxy relies on signed integer wraparound on overflow, however this is
really an undefined behavior, so the C compiler is allowed to do whatever
it wants, and clang does exactly that, and that causes problems when the
timer goes from <= INT_MAX to > INT_MAX, and explains the various hangs
reported on FreeBSD every 49.7 days. To make sure we get the intended
behavior, use -fwrapv for now. A proper fix is to switch everything to
unsigned, and it will happen later, but this is simpler, and more likely to
be backported to the stable branches.
Many thanks to David King, Mark S, Dave Cottlehuber, Slawa Olhovchenkov,
Piotr Pawel Stefaniak, and any other I may have forgotten for reporting that
 and investigating.
---
 Makefile | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index a87e1e2..63db432 100644
--- a/Makefile
+++ b/Makefile
@@ -131,7 +131,10 @@ DEBUG_CFLAGS = -g
 #### Compiler-specific flags that may be used to disable some negative over-
 # optimization or to silence some warnings. -fno-strict-aliasing is needed with
 # gcc >= 4.4.
-SPEC_CFLAGS = -fno-strict-aliasing -Wdeclaration-after-statement
+# We rely on signed integer wraparound on overflow, however clang thinks it
+# can do whatever it wants since it's undefined behavior, so use -fwrapv
+# to be sure we get the intended behavior.
+SPEC_CFLAGS = -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv
 
 #### Memory usage tuning
 # If small memory footprint is required, you can reduce the buffer size. There
-- 
2.9.3
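
To illustrate the class of code this protects, here is a sketch with
made-up names, not the actual haproxy timer code:

/* Sketch: a wrapping tick comparison. Ticks are free-running 32-bit
 * millisecond counters that wrap every ~49.7 days; the subtraction is
 * expected to wrap too, so the sign of the difference stays meaningful
 * across the rollover.
 */
static inline int ticks_expired(int now, int deadline)
{
	return (now - deadline) >= 0;
}
/* Since signed overflow is undefined behavior in ISO C, a compiler may
 * assume it cannot happen and rewrite the test as "now >= deadline",
 * which gives the wrong answer once the counter wraps. -fwrapv restores
 * the two's-complement wraparound this idiom relies on.
 */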



[PATCH] minor harmless bugfix in server_parse_sni_expr

2017-04-20 Thread Olivier Houchard
Hi,

In server_parse_sni_expr(), we use the "proxy" global variable when I think
we really want to use the "px" argument, so the attached patch fixes this.
Hopefully one day that proxy variable will be renamed :)

Olivier
>From eb5033db545ae093f73485e4a29c112e126c159c Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Thu, 20 Apr 2017 18:21:17 +0200
Subject: [PATCH] MINOR: server: don't use "proxy" when px is really meant.

In server_parse_sni_expr(), we use the "proxy" global variable, when we
should probably be using "px" given as an argument.
It happens to work by accident right now, but may not in the future.
---
 src/server.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/server.c b/src/server.c
index 878293f..a151fd4 100644
--- a/src/server.c
+++ b/src/server.c
@@ -1539,10 +1539,10 @@ static int server_parse_sni_expr(struct server *newsrv, 
struct proxy *px, char *
};
 
idx = 0;
-   proxy->conf.args.ctx = ARGC_SRV;
+   px->conf.args.ctx = ARGC_SRV;
 
expr = sample_parse_expr((char **)args, &idx, px->conf.file, 
px->conf.line,
-err, &proxy->conf.args);
+err, &px->conf.args);
if (!expr) {
memprintf(err, "error detected while parsing sni expression : 
%s", *err);
return ERR_ALERT | ERR_FATAL;
-- 
2.9.3



Re: [RFC][PATCHES] seamless reload

2017-05-04 Thread Olivier Houchard
On Thu, May 04, 2017 at 10:03:07AM +, Pierre Cheynier wrote:
> Hi Olivier,
> 
> Many thanks for that ! As you know, we are very interested on this topic.
> We'll test your patches soon for sure.
> 
> Pierre

Hi Pierre :)

Thanks ! I'm very interested in knowing how well it works for you.
Maybe we can talk about that around a beer sometime.

Olivier



Re: [RFC][PATCHES] seamless reload

2017-05-08 Thread Olivier Houchard
Hi Pavlos,

On Sun, May 07, 2017 at 12:05:28AM +0200, Pavlos Parissis wrote:
[...]
> Ignore what I wrote, I am an idiot, as I forgot the most important bit of
> the test: to enable the seamless reload by supplying the
> HAPROXY_STATS_SOCKET environment variable :-(
> 
> I added to the systemd overwrite file:
> [Service]
> Environment=CONFIG="/etc/lb_engine/haproxy.cfg"
> "HAPROXY_STATS_SOCKET=/run/lb_engine/process-1.sock"
> 
> and wrk2 reports ZERO errors, whereas HAPEE reports ~49.
> 
> I am terribly sorry for this stupid mistake.
> 
> But this mistake revealed something interesting: the fact that with the
> latest code we have more errors during reload.
> 
> @Olivier, great work dude. I am waiting for this to be back-ported to 
> HAPEE-1.7r1.
> 
> Once again I am sorry for my mistake,
> Pavlos
> 

Thanks a lot for testing !
This is interesting indeed. My patch may make it worse when not passing
fds via the unix socket, as all processes now keep all sockets opened, even
the ones they're not using; maybe that makes the window between the last
accept and the close bigger.
If that is so, then the global option "no-unused-socket" should provide
a comparable error rate.
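
For reference, a minimal sketch of this kind of setup (paths illustrative,
and assuming the -x option from the version that was eventually merged):

# haproxy.cfg: a stats socket the new process can retrieve the old FDs from
global
    stats socket /run/haproxy/process-1.sock mode 600 level admin

# reload: -x makes the new process fetch the listening sockets over that socket
haproxy -f /etc/haproxy/haproxy.cfg -x /run/haproxy/process-1.sock \
    -sf $(cat /run/haproxy.pid)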

Regards,

Olivier



Minor bugfix

2017-07-17 Thread Olivier Houchard
Hi guys,

The attached patch fixes a potential use-after-free, if for some reason we
failed to get the address of a transferred socket.
It should be fairly safe to apply.

Regards,

Olivier
>From 6fa0e381b38d3a9a3d29e59cbcca34fb1d375e3e Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Mon, 17 Jul 2017 17:25:33 +0200
Subject: [PATCH] BUG/MINOR: Prevent a use-after-free on error scenario.

---
 src/haproxy.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/haproxy.c b/src/haproxy.c
index 23161007..7af1092d 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -997,6 +997,7 @@ static int get_old_sockets(const char *unixsocket)
if (getsockname(fd, (struct sockaddr *)&xfer_sock->addr, 
&socklen) != 0) {
Warning("Failed to get socket address\n");
free(xfer_sock);
+   xfer_sock = NULL;
continue;
}
if (curoff >= maxoff) {
-- 
2.13.3



[PATCHES] SRV record support

2017-08-04 Thread Olivier Houchard
Hi guys,

Following Baptiste's work on DNS, the attached patchset adds support for DNS
obsolescence, and SRV support.
DNS obsolescence means we cache DNS answers, and only consider the entries
gone if we haven't seen them for X seconds, X being defined in the config file
with the "hold obsolete" entry in the resolvers section, ie :
resolvers dns
nameserver pouet 8.8.8.8:53
hold valid 10s
hold obsolete 5s

This is done as we may get incomplete DNS answers for each request, so we
can't assume an entry is gone just because it was not in a DNS answer.
This also adds support for SRV records. To use them, simply use a SRV label
instead of a hostname on the server line, ie :
server s1 _http._tcp.example.com  resolvers dns check
server s2 _http._tcp.example.com  resolvers dns check

When this is done, haproxy will first resolve _http._tcp.example.com, and then
give the hostname (as well as the port and weight) to each available server,
which will then do a regular DNS resolution to get the IP.
The SRV label is resolved periodically; any server that disappears will be
removed, and any new server will be added, assuming there are free servers in
the haproxy config.
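
For reference, an SRV record carries a priority, a weight, a port and a target
hostname, so a zone serving the example above could contain (illustrative
data, not from the patchset):

_http._tcp.example.com. 60 IN SRV 10 50 8080 srv1.example.com.
_http._tcp.example.com. 60 IN SRV 10 50 8080 srv2.example.com.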

Any testing would be greatly appreciated.

Regards,

Olivier

>From 1b408464590fea38d8a45b2b7fed5c615465a858 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Thu, 6 Jul 2017 18:46:47 +0200
Subject: [PATCH 1/4] MINOR: dns: Cache previous DNS answers.

As DNS servers may not return all IPs in one answer, we want to cache the
previous entries. Those entries are removed when considered obsolete, which
happens when the IP hasn't been returned by the DNS server for a time
defined in the "hold obsolete" parameter of the resolver section. The default
is 30s.
---
 doc/configuration.txt  |   7 +-
 include/proto/server.h |   2 +-
 include/types/dns.h|   9 +-
 src/cfgparse.c |   5 +-
 src/dns.c  | 247 -
 src/server.c   |  28 --
 6 files changed, 175 insertions(+), 123 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index bfeb3ce0..f4674387 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -11693,6 +11693,10 @@ For example, with 2 name servers configured in a 
resolvers section:
  - first response is truncated and second one is a NX Domain, then HAProxy
stops resolution.
 
+As a DNS server may not answer all the IPs in one DNS request, haproxy keeps
+a cache of previous answers, an answer will be considered obsolete after
+"hold obsolete" seconds without the IP returned.
+
 
 resolvers <resolvers id>
   Creates a new name server list labelled <resolvers id>
@@ -11709,7 +11713,7 @@ hold <status> <period>
   Defines <period> during which the last name resolution should be kept based
   on last resolution <status>
 <status> : last name resolution status. Acceptable values are "nx",
-   "other", "refused", "timeout", "valid".
+   "other", "refused", "timeout", "valid", "obsolete".
 <period> : interval between two successive name resolution when the last
answer was in <status>. It follows the HAProxy time format.
 <period> is in milliseconds by default.
@@ -11756,6 +11760,7 @@ timeout <event> <time>
  hold nx  30s
  hold timeout 30s
  hold valid   10s
+ hold obsolete30s
 
 
 6. HTTP header manipulation
diff --git a/include/proto/server.h b/include/proto/server.h
index 43e4e425..c4f8e1d5 100644
--- a/include/proto/server.h
+++ b/include/proto/server.h
@@ -52,7 +52,7 @@ int srv_init_addr(void);
 struct server *cli_find_server(struct appctx *appctx, char *arg);
 
 /* functions related to server name resolution */
-int snr_update_srv_status(struct server *s);
+int snr_update_srv_status(struct server *s, int has_no_ip);
 int snr_resolution_cb(struct dns_requester *requester, struct dns_nameserver 
*nameserver);
 int snr_resolution_error_cb(struct dns_requester *requester, int error_code);
 struct server *snr_check_ip_callback(struct server *srv, void *ip, unsigned 
char *ip_family);
diff --git a/include/types/dns.h b/include/types/dns.h
index 7a19aa37..12c11552 100644
--- a/include/types/dns.h
+++ b/include/types/dns.h
@@ -113,7 +113,7 @@ struct dns_query_item {
 /* NOTE: big endian structure */
 struct dns_answer_item {
struct list list;
-   char *name; /* answer name
+   char name[DNS_MAX_NAME_SIZE];   /* answer name
 * For SRV type, name also 
includes service
 * and protocol value */
int16_t type;   /* question type */
@@ -124,7 +124,8 @@ struct dns_answer_item {
int16_t port;   /* SRV type port */
int16_t data_len;

Re: [PATCHES] SRV record support

2017-08-07 Thread Olivier Houchard
Hi,

On Fri, Aug 04, 2017 at 09:18:30PM +0200, Willy Tarreau wrote:
> Just a few questions and minor comments below :
> 
> On Fri, Aug 04, 2017 at 06:49:43PM +0200, Olivier Houchard wrote:
> > This also adds support for SRV records. To use them, simply use a SRV label
> > instead of a hostname on the server line, ie :
> > server s1 _http._tcp.example.com  resolvers dns check
> > server s2 _http._tcp.example.com  resolvers dns check
> > 
> > When this is done, haproxy will first resolve _http._tcp.example.com, and 
> > then
> > give the hostname (as well as port and weight) to each available server, 
> > that
> > will then do a regular DNS resolution to get the IP.
> 
> What makes the distinction between an SRV record and a real hostname here ?
> Just the leading underscore, or the plain "_http." maybe ? I'm not expecting
> any problem with this given that the underscore is not allowed as a regular
> hostname character (except under windows). But at least this will deserve
> a mention in the doc where the server's address is described, so that anyone
> experiencing trouble could spot this easily.
> 

Yes, whatever starts with an underscore. I didn't want to rely on _http, as
it may not be http. As underscores aren't supposed to be present in hostnames,
and I certainly hope even people using them don't have hostnames starting with
one, I thought it was OK.
It is documented in the updated patches attached.
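
The detection itself is then a simple prefix test; conceptually (a sketch
only, not the exact patch code, with DNS_RTYPE_SRV used just to illustrate
the record-type constant):

/* a server address beginning with '_' is taken as an SRV label */
if (hostname[0] == '_')
    query_type = DNS_RTYPE_SRV;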

> > From 1b408464590fea38d8a45b2b7fed5c615465a858 Mon Sep 17 00:00:00 2001
> > From: Olivier Houchard 
> > Date: Thu, 6 Jul 2017 18:46:47 +0200
> > Subject: [PATCH 1/4] MINOR: dns: Cache previous DNS answers.
> > 
> > As DNS servers may not return all IPs in one answer, we want to cache the
> > previous entries. Those entries are removed when considered obsolete, which
> > happens when the IP hasn't been returned by the DNS server for a time
> > defined in the "hold obsolete" parameter of the resolver section. The 
> > default
> > is 30s.
> > ---
> >  doc/configuration.txt  |   7 +-
> >  include/proto/server.h |   2 +-
> >  include/types/dns.h|   9 +-
> >  src/cfgparse.c |   5 +-
> >  src/dns.c  | 247 
> > -
> >  src/server.c   |  28 --
> >  6 files changed, 175 insertions(+), 123 deletions(-)
> > 
> > diff --git a/doc/configuration.txt b/doc/configuration.txt
> > index bfeb3ce0..f4674387 100644
> > --- a/doc/configuration.txt
> > +++ b/doc/configuration.txt
> > @@ -11693,6 +11693,10 @@ For example, with 2 name servers configured in a 
> > resolvers section:
> >   - first response is truncated and second one is a NX Domain, then HAProxy
> > stops resolution.
> >  
> > +As a DNS server may not answer all the IPs in one DNS request, haproxy 
> > keeps
> > +a cache of previous answers, an answer will be considered obsolete after
> > +"hold obsolete" seconds without the IP returned.
> > +
> >  
> >  resolvers <resolvers id>
> >Creates a new name server list labelled <resolvers id>
> > @@ -11709,7 +11713,7 @@ hold <status> <period>
> >Defines <period> during which the last name resolution should be kept
> > based on last resolution <status>
> >   <status> : last name resolution status. Acceptable values are "nx",
> > -   "other", "refused", "timeout", "valid".
> > +   "other", "refused", "timeout", "valid", "obsolete".
> >   <period> : interval between two successive name resolution when the
> > last answer was in <status>. It follows the HAProxy time format.
> >  <period> is in milliseconds by default.
> > @@ -11756,6 +11760,7 @@ timeout <event> <time>
> >   hold nx  30s
> >   hold timeout 30s
> >   hold valid   10s
> > + hold obsolete30s
> >  
> >  
> >  6. HTTP header manipulation
> > diff --git a/include/proto/server.h b/include/proto/server.h
> > index 43e4e425..c4f8e1d5 100644
> > --- a/include/proto/server.h
> > +++ b/include/proto/server.h
> > @@ -52,7 +52,7 @@ int srv_init_addr(void);
> >  struct server *cli_find_server(struct appctx *appctx, char *arg);
> >  
> >  /* functions related to server name resolution */
> > -int snr_update_srv_status(struct server *s);
> > +int snr_update_srv_status(struct server *s, int has_no_ip);
> >  int snr_resolution_cb(struct dns_requester *requester, struct 
> > dns_nameserver *nameserver);
> &

Re: [PATCHES] SRV record support

2017-08-09 Thread Olivier Houchard

Hi,

After some review and tests by Baptiste, here comes an updated patchset,
with a few bugfixes.
This one is probably mergeable.

Regards,

Olivier
>From 1b408464590fea38d8a45b2b7fed5c615465a858 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Thu, 6 Jul 2017 18:46:47 +0200
Subject: [PATCH 1/6] MINOR: dns: Cache previous DNS answers.

As DNS servers may not return all IPs in one answer, we want to cache the
previous entries. Those entries are removed when considered obsolete, which
happens when the IP hasn't been returned by the DNS server for a time
defined in the "hold obsolete" parameter of the resolver section. The default
is 30s.
---
 doc/configuration.txt  |   7 +-
 include/proto/server.h |   2 +-
 include/types/dns.h|   9 +-
 src/cfgparse.c |   5 +-
 src/dns.c  | 247 -
 src/server.c   |  28 --
 6 files changed, 175 insertions(+), 123 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index bfeb3ce0..f4674387 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -11693,6 +11693,10 @@ For example, with 2 name servers configured in a 
resolvers section:
  - first response is truncated and second one is a NX Domain, then HAProxy
stops resolution.
 
+As a DNS server may not answer all the IPs in one DNS request, haproxy keeps
+a cache of previous answers, an answer will be considered obsolete after
+"hold obsolete" seconds without the IP returned.
+
 
 resolvers <resolvers id>
   Creates a new name server list labelled <resolvers id>
@@ -11709,7 +11713,7 @@ hold <status> <period>
   Defines <period> during which the last name resolution should be kept based
   on last resolution <status>
 <status> : last name resolution status. Acceptable values are "nx",
-   "other", "refused", "timeout", "valid".
+   "other", "refused", "timeout", "valid", "obsolete".
 <period> : interval between two successive name resolution when the last
answer was in <status>. It follows the HAProxy time format.
 <period> is in milliseconds by default.
@@ -11756,6 +11760,7 @@ timeout <event> <time>
  hold nx  30s
  hold timeout 30s
  hold valid   10s
+ hold obsolete30s
 
 
 6. HTTP header manipulation
diff --git a/include/proto/server.h b/include/proto/server.h
index 43e4e425..c4f8e1d5 100644
--- a/include/proto/server.h
+++ b/include/proto/server.h
@@ -52,7 +52,7 @@ int srv_init_addr(void);
 struct server *cli_find_server(struct appctx *appctx, char *arg);
 
 /* functions related to server name resolution */
-int snr_update_srv_status(struct server *s);
+int snr_update_srv_status(struct server *s, int has_no_ip);
 int snr_resolution_cb(struct dns_requester *requester, struct dns_nameserver 
*nameserver);
 int snr_resolution_error_cb(struct dns_requester *requester, int error_code);
 struct server *snr_check_ip_callback(struct server *srv, void *ip, unsigned 
char *ip_family);
diff --git a/include/types/dns.h b/include/types/dns.h
index 7a19aa37..12c11552 100644
--- a/include/types/dns.h
+++ b/include/types/dns.h
@@ -113,7 +113,7 @@ struct dns_query_item {
 /* NOTE: big endian structure */
 struct dns_answer_item {
struct list list;
-   char *name; /* answer name
+   char name[DNS_MAX_NAME_SIZE];   /* answer name
 * For SRV type, name also 
includes service
 * and protocol value */
int16_t type;   /* question type */
@@ -124,7 +124,8 @@ struct dns_answer_item {
int16_t port;   /* SRV type port */
int16_t data_len;   /* number of bytes in target 
below */
struct sockaddr address;/* IPv4 or IPv6, network format 
*/
-   char *target;   /* Response data: SRV or CNAME 
type target */
+   char target[DNS_MAX_NAME_SIZE]; /* Response data: SRV or CNAME 
type target */
+   time_t last_seen;   /* When the answer was last seen */
 };
 
 struct dns_response_packet {
@@ -158,6 +159,7 @@ struct dns_resolvers {
int timeout;/*   no answer was delivered */
int refused;/*   dns server refused to answer */
int other;  /*   other dns response errors */
+   int obsolete;   /*   an answer hasn't been seen */
} hold;
struct task *t; /* timeout management */
int resolution_pool_size;   /* size of the resolution pool 
associated to this resolvers section */
@@ -252,8 +254,6 @@ struct dns_resolution {
unsigned long long revision;/* updated for each update */
struct dns_response_pac

[PATCH][MINOR] rename the raw socket constructor

2017-08-14 Thread Olivier Houchard
Hi,

I just noticed the raw socket constructor was called __ssl_sock_deinit,
which is a bit confusing, and wrong twice, so the attached patch renames it
to __raw_sock_init, which seems more correct.
This is purely cosmetic.

Regards,

Olivier
>From 14821426ab9ff73fdb27a7a2259e54f2fd46f6bc Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Mon, 14 Aug 2017 15:59:44 +0200
Subject: [PATCH] MINOR: Use a better name for the constructor than
 __ssl_sock_deinit()

---
 src/raw_sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/raw_sock.c b/src/raw_sock.c
index 8aba58d7..707ba55f 100644
--- a/src/raw_sock.c
+++ b/src/raw_sock.c
@@ -429,7 +429,7 @@ static struct xprt_ops raw_sock = {
 
 
 __attribute__((constructor))
-static void __ssl_sock_deinit(void)
+static void __raw_sock_init(void)
 {
xprt_register(XPRT_RAW, &raw_sock);
 }
-- 
2.13.3



Re: FreeBSD CPU Affinity

2017-08-16 Thread Olivier Houchard
On Wed, Aug 16, 2017 at 11:28:52AM -0400, Mark Staudinger wrote:
> On Wed, 16 Aug 2017 10:47:32 -0400, Dmitry Sivachenko 
> wrote:
> 
> > 
> > > On 16 Aug 2017, at 17:40, Mark Staudinger 
> > > wrote:
> > > 
> > > On Wed, 16 Aug 2017 10:35:05 -0400, Dmitry Sivachenko
> > >  wrote:
> > > 
> > > > Hello,
> > > > 
> > > > are you installing haproxy form FreeBSD ports?
> > > > 
> > > > I just tried your configuration and it works as you expect.
> > > > 
> > > > If you are building haproxy by hand, add USE_CPU_AFFINITY=1
> > > > parameter to make manually.  FreeBSD port do that for you.
> > > > 
> > > > 
> > > > 
> > > 
> > > 
> > > Hi Dmitry,
> > > 
> > > I am running (for now) a locally compiled from source version.
> > > 
> > > Build options :
> > >  TARGET  = freebsd
> > >  CPU = generic
> > >  CC  = clang
> > >  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
> > > -fwrapv
> > >  OPTIONS = USE_CPU_AFFINITY=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1
> > > USE_STATIC_PCRE=1 USE_PCRE_JIT=1
> > 
> > 
> > 
> > Strange.  I am testing on FreeBSD-10-stable though.
> > 
> > May be you add return code check for cpuset_setaffinity() and log
> > possible error?
> 
> Output of from truss on starup yields this:
> 
>  3862: cpuset_setaffinity(0x3,0x2,0x,0x8,0x773dd0) ERR#34
> 'Result too large'
>  3863: cpuset_setaffinity(0x3,0x2,0x,0x8,0x773dd8) ERR#34
> 'Result too large'
> 

Hi Mark,

I think I know what's going on.
Can you try the attached patch ?

Thanks !

Olivier
>From d01f5a680cd8dcfb8d90da05cd7474d29c759fa9 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 16 Aug 2017 17:29:11 +0200
Subject: [PATCH] MINOR: Fix CPU usage on FreeBSD.

Use a cpuset_t instead of assuming the cpu mask is an unsigned long.
This should fix setting the CPU affinity on FreeBSD >= 11.
---
 src/haproxy.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/haproxy.c b/src/haproxy.c
index 52722db8..67347d68 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -2556,7 +2556,18 @@ int main(int argc, char **argv)
proc < LONGBITS &&   /* only the first 32/64 processes 
may be pinned */
global.cpu_map[proc])/* only do this if the process has 
a CPU map */
 #ifdef __FreeBSD__
-   cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_PID, -1, 
sizeof(unsigned long), (void *)&global.cpu_map[proc]);
+   {
+   cpuset_t cpuset;
+   int i;
+   unsigned long cpu_map = global.cpu_map[proc];
+
+   CPU_ZERO(&cpuset);
+   while ((i = ffsl(cpu_map)) > 0) {
+   CPU_SET(i - 1, &cpuset);
+   cpu_map &= ~(1 << (i - 1));
+   }
+   ret = cpuset_setaffinity(CPU_LEVEL_WHICH, 
CPU_WHICH_PID, -1, sizeof(cpuset), &cpuset);
+   }
 #else
sched_setaffinity(0, sizeof(unsigned long), (void 
*)&global.cpu_map[proc]);
 #endif
-- 
2.11.1



Re: FreeBSD CPU Affinity

2017-08-16 Thread Olivier Houchard
On Wed, Aug 16, 2017 at 11:43:30AM -0400, Mark Staudinger wrote:
> On Wed, 16 Aug 2017 11:32:01 -0400, Olivier Houchard 
> wrote:
> 
> > On Wed, Aug 16, 2017 at 11:28:52AM -0400, Mark Staudinger wrote:
> > > On Wed, 16 Aug 2017 10:47:32 -0400, Dmitry Sivachenko
> > > 
> > > wrote:
> > > 
> > > >
> > > > > On 16 Aug 2017, at 17:40, Mark Staudinger 
> > > > > wrote:
> > > > >
> > > > > On Wed, 16 Aug 2017 10:35:05 -0400, Dmitry Sivachenko
> > > > >  wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > are you installing haproxy form FreeBSD ports?
> > > > > >
> > > > > > I just tried your configuration and it works as you expect.
> > > > > >
> > > > > > If you are building haproxy by hand, add USE_CPU_AFFINITY=1
> > > > > > parameter to make manually.  FreeBSD port do that for you.
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > Hi Dmitry,
> > > > >
> > > > > I am running (for now) a locally compiled from source version.
> > > > >
> > > > > Build options :
> > > > >  TARGET  = freebsd
> > > > >  CPU = generic
> > > > >  CC  = clang
> > > > >  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
> > > > > -fwrapv
> > > > >  OPTIONS = USE_CPU_AFFINITY=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1
> > > > > USE_STATIC_PCRE=1 USE_PCRE_JIT=1
> > > >
> > > >
> > > >
> > > > Strange.  I am testing on FreeBSD-10-stable though.
> > > >
> > > > May be you add return code check for cpuset_setaffinity() and log
> > > > possible error?
> > > 
> > > Output of from truss on starup yields this:
> > > 
> > >  3862: cpuset_setaffinity(0x3,0x2,0x,0x8,0x773dd0)
> > > ERR#34
> > > 'Result too large'
> > >  3863: cpuset_setaffinity(0x3,0x2,0x,0x8,0x773dd8)
> > > ERR#34
> > > 'Result too large'
> > > 
> > 
> > Hi Mark,
> > 
> > I think I know what's going on.
> > Can you try the attached patch ?
> > 
> > Thanks !
> > 
> > Olivier
> 
> Hi Olivier,
> 
> The patch checks out - affinity is set properly on both FreeBSD-10 and
> FreBSD-11.
> 

Thanks for testing !

Willy, can you apply this ?
Of course, I forgot to mention it in the commit log, but this should probably
be backported to supported releases.

Regards,

Olivier



Re: FreeBSD CPU Affinity

2017-08-17 Thread Olivier Houchard
On Thu, Aug 17, 2017 at 04:27:55PM +0300, Dmitry Sivachenko wrote:
> 
> > On 16 Aug 2017, at 18:32, Olivier Houchard  wrote:
> > 
> > 
> > 
> > I think I know what's going on.
> > Can you try the attached patch ?
> > 
> > Thanks !
> > 
> > Olivier
> > <0001-MINOR-Fix-CPU-usage-on-FreeBSD.patch>
> 
> 
> Also, it would be probably correct thing to check return code from 
> cpuset_setaffinity() and treat it as fatal error, aborting haproxy startup.
> 
> It is better to get an error message on start rather than guess why it does 
> not work as expected.

I don't know if it should be fatal, but I certainly agree at least a warning
would be welcome.
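
A sketch of what that could look like, based on the committed mask
conversion (illustrative only, not a submitted patch):

#include <sys/param.h>
#include <sys/cpuset.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>

/* Pin the current process to the CPUs set in "mask", reporting failures
 * instead of silently ignoring the return code. */
static void pin_to_cpus(unsigned long mask)
{
    cpuset_t cpuset;
    int i;

    CPU_ZERO(&cpuset);
    while ((i = ffsl(mask)) > 0) {
        CPU_SET(i - 1, &cpuset);
        mask &= ~(1UL << (i - 1));
    }
    if (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_PID, -1,
                           sizeof(cpuset), &cpuset) == -1)
        fprintf(stderr, "cpuset_setaffinity: %s\n", strerror(errno));
}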

Regards,

Olivier



[PATCH][MINOR] Inline functions in common/net_helper.h

2017-09-13 Thread Olivier Houchard
Hi guys,

For some reason, the functions in include/common/net_helper.h aren't declared
static inline, which will certainly be a problem (duplicate symbols at link
time) the day we have multiple users of that file.
The attached patch addresses this.
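
For anyone wondering why this matters: a non-static, non-inline function
defined in a header is emitted in every translation unit that includes it,
so the first time two files include the header the link fails. A minimal
illustration (file names made up):

/* a.c */
#include <common/net_helper.h>
uint16_t a(const void *p) { return readu16(p); }

/* b.c */
#include <common/net_helper.h>
uint16_t b(const void *p) { return readu16(p); }

/* Linking a.o and b.o then fails with:
 *   multiple definition of `readu16'
 * "static inline" avoids this by giving each unit its own local copy. */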

Regards,

Olivier
>From cd330f4d3befa15aa4956865445273ee7b657076 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 13 Sep 2017 11:49:22 +0200
Subject: [PATCH] MINOR: net_helper: Inline functions meant to be inlined.

---
 include/common/net_helper.h | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/common/net_helper.h b/include/common/net_helper.h
index 86809a684..78975144c 100644
--- a/include/common/net_helper.h
+++ b/include/common/net_helper.h
@@ -31,42 +31,42 @@
 /* Functions to read various integer that may be unaligned */
 
 /* Read a uint16_t */
-uint16_t readu16(const void *p)
+static inline uint16_t readu16(const void *p)
 {
 const union {  uint16_t u16; } __attribute__((packed))*u = p;
 return u->u16;
 }
 
 /* Read a int16_t */
-int16_t readi16(const void *p)
+static inline int16_t readi16(const void *p)
 {
 const union {  int16_t i16; } __attribute__((packed))*u = p;
 return u->i16;
 }
 
 /* Read a uint16_t, and convert from network order to host order */
-uint16_t readn16(const void *p)
+static inline uint16_t readn16(const void *p)
 {
 const union {  uint16_t u16; } __attribute__((packed))*u = p;
 return ntohs(u->u16);
 }
 
 /* Read a uint32_t */
-uint32_t readu32(const void *p)
+static inline uint32_t readu32(const void *p)
 {
 const union {  uint32_t u32; } __attribute__((packed))*u = p;
 return u->u32;
 }
 
 /* Read a int32_t */
-int16_t readi32(const void *p)
+static inline int16_t readi32(const void *p)
 {
 const union {  int32_t i32; } __attribute__((packed))*u = p;
 return u->i32;
 }
 
 /* Read a uint32_t, and convert from network order to host order */
-uint32_t readn32(const void *p)
+static inline uint32_t readn32(const void *p)
 {
 const union {  uint32_t u32; } __attribute__((packed))*u = p;
 return ntohl(u->u32);
-- 
2.13.3



[PATCHES][ssl] Add 0-RTT support with OpenSSL 1.1.1

2017-10-02 Thread Olivier Houchard
 }
else {
ssl_ctx = ssl_sock_do_create_cert(servername, bind_conf, ssl);
SSL_set_SSL_CTX(ssl, ssl_ctx);
/* No LRU cache, this CTX will be released as soon as the 
session dies */
SSL_CTX_free(ssl_ctx);
+   return 1;
}
-   return ssl_ctx;
+   return 0;
+}
+static int
+ssl_sock_generate_certificate_from_conn(struct bind_conf *bind_conf, SSL *ssl)
+{
+   unsigned int key;
+   SSL_CTX *ssl_ctx = NULL;
+   struct connection *conn = SSL_get_app_data(ssl);
+
+   conn_get_to_addr(conn);
+   if (conn->flags & CO_FL_ADDR_TO_SET) {
+   key = ssl_sock_generated_cert_key(&conn->addr.to, 
get_addr_len(&conn->addr.to));
+   ssl_ctx = ssl_sock_get_generated_cert(key, bind_conf);
+   if (ssl_ctx) {
+   /* switch ctx */
+   SSL_set_SSL_CTX(ssl, ssl_ctx);
+   return 1;
+   }
+   }
+   return 0;
 }
 #endif /* !defined SSL_NO_GENERATE_CERTIFICATES */
 
@@ -1954,10 +1975,10 @@ static void ssl_sock_switchctx_set(SSL *ssl, SSL_CTX 
*ctx)
 
 static int ssl_sock_switchctx_err_cbk(SSL *ssl, int *al, void *priv)
 {
+   struct bind_conf *s = priv;
(void)al; /* shut gcc stupid warning */
-   (void)priv;
 
-   if (!SSL_get_servername(ssl, TLSEXT_NAMETYPE_host_name))
+   if (!SSL_get_servername(ssl, TLSEXT_NAMETYPE_host_name) && 
!s->generate_certs)
return SSL_TLSEXT_ERR_NOACK;
return SSL_TLSEXT_ERR_OK;
 }
@@ -2021,6 +2042,11 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, 
void *arg)
servername = extension_data;
servername_len = len;
} else {
+#if (!defined SSL_NO_GENERATE_CERTIFICATES)
+   if (s->generate_certs && 
ssl_sock_generate_certificate_from_conn(s, ssl)) {
+   return 1;
+   }
+#endif
/* without SNI extension, is the default_ctx (need 
SSL_TLSEXT_ERR_NOACK) */
if (!s->strict_sni) {
ssl_sock_switchctx_set(ssl, s->default_ctx);
@@ -2164,6 +2190,12 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, 
void *arg)
methodVersions[conf->ssl_methods.max].ssl_set_version(ssl, 
SET_MAX);
return 1;
}
+#if (!defined SSL_NO_GENERATE_CERTIFICATES)
+   if (s->generate_certs && ssl_sock_generate_certificate(trash.str, s, 
ssl)) {
+   /* switch ctx done in ssl_sock_generate_certificate */
+   return 1;
+   }
+#endif
if (!s->strict_sni) {
/* no certificate match, is the default_ctx */
ssl_sock_switchctx_set(ssl, s->default_ctx);
@@ -2198,22 +2230,8 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, 
void *priv)
servername = SSL_get_servername(ssl, TLSEXT_NAMETYPE_host_name);
if (!servername) {
 #if (!defined SSL_NO_GENERATE_CERTIFICATES)
-   if (s->generate_certs) {
-   struct connection *conn = SSL_get_app_data(ssl);
-   unsigned int key;
-   SSL_CTX *ctx;
-
-   conn_get_to_addr(conn);
-   if (conn->flags & CO_FL_ADDR_TO_SET) {
-   key = 
ssl_sock_generated_cert_key(&conn->addr.to, get_addr_len(&conn->addr.to));
-   ctx = ssl_sock_get_generated_cert(key, s);
-   if (ctx) {
-   /* switch ctx */
-   SSL_set_SSL_CTX(ssl, ctx);
-   return SSL_TLSEXT_ERR_OK;
-   }
-   }
-   }
+   if (s->generate_certs && 
ssl_sock_generate_certificate_from_conn(s, ssl))
+   return SSL_TLSEXT_ERR_OK;
 #endif
if (s->strict_sni)
return SSL_TLSEXT_ERR_ALERT_FATAL;
@@ -2246,10 +2264,8 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, 
void *priv)
}
if (!node || container_of(node, struct sni_ctx, name)->neg) {
 #if (!defined SSL_NO_GENERATE_CERTIFICATES)
-   SSL_CTX *ctx;
-   if (s->generate_certs &&
-   (ctx = ssl_sock_generate_certificate(servername, s, ssl))) {
-   /* switch ctx */
+   if (s->generate_certs && 
ssl_sock_generate_certificate(servername, s, ssl)) {
+   /* switch ctx done in ssl_sock_generate_certificate */
return SSL_TLSEXT_ERR_OK;
}
 #endif
@@ -3677,8 +3693,8 @@ ssl_sock_initial_ctx(struct bind_conf *bind_conf)
SSL_CTX_set_tlsext_servername_callback(ctx, ssl_sock_switchctx_

Re: [PATCHES][ssl] Add 0-RTT support with OpenSSL 1.1.1

2017-10-02 Thread Olivier Houchard
Hi Igor,

On Tue, Oct 03, 2017 at 12:06:05AM +0800, Igor Pav wrote:
> It's exciting. Does the server line (client side) support 0-rtt?
> 

Unfortunately, it does not yet. I'm investigating adding it.

Regards,

Olivier

> On Mon, Oct 2, 2017 at 11:18 PM, Olivier Houchard  
> wrote:
> > Hi,
> >
> > The attached patches add experimental support for 0-RTT with OpenSSL 1.1.1
> > They are based on Emmanuel's previous patches, so I'm submitting them again,
> > updated to reflect the changes in OpenSSL API, and with a few fixes.
> > To allow the use of early data, one has to explicitly add "allow-0rtt" to
> > its bind line. If early data are provided by the client, a
> > "Early-Data: 1" header will be added, to let the origin server know that.
> >
> > Because early data have security implications, a new sample fetch was added,
> > "ssl_fc_has_early", a boolean that will be evaluated to true if early data
> > were provided, as well as a new action, "wait-for-handshake", which will make
> > haproxy wait for the completion of the SSL handshake before processing the
> > request. After the handshake, early data are considered as normal data, and
> > they won't be reported to the origin server.
> >
> > As usual, bugs are to be expected, and any review and/or test will be
> > appreciated.
> >
> > Regards,
> >
> > Olivier
> 



Re: Reload takes about 3 minutes

2017-10-13 Thread Olivier Houchard
Hi Joel,

On Fri, Oct 13, 2017 at 03:22:56PM +0200, Joel W Kall wrote:
> Got some results from strace. Running the reload with sudo takes about 3
> minutes and shows that it spends most of the time on:
> 
> 14:39:38.077925 poll([{fd=6, events=POLLIN}], 1, -1) = ?
> ERESTART_RESTARTBLOCK (Interrupted by signal) <180.434742>
> 
> But running without sudo it's a bit faster (about 2 minutes) and it spends
> a lot of time on multiple calls like this (note the 5 second timeout on the
> last row):
> 
> 14:47:35.040118 ioctl(4, FIONREAD, [149]) = 0 <0.11>
> 14:47:35.040156 recvfrom(4,
> "\201\t\201\200\0\1\0\1\0\5\0\0\10commodus\6loop54\2se\0"..., 1024, 0,
> {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("213.80.98.2")},
> [16]) = 149 <0.11>
> 14:47:35.040200 close(4)= 0 <0.14>
> 14:47:35.040313 brk(0x3ba7000)  = 0x3ba7000 <0.13>
> 14:47:35.040440 brk(0x3bcb000)  = 0x3bcb000 <0.12>
> 14:47:35.040542 stat("/etc/resolv.conf", {st_mode=S_IFREG|0644,
> st_size=198, ...}) = 0 <0.14>
> 14:47:35.040591 stat("/etc/resolv.conf", {st_mode=S_IFREG|0644,
> st_size=198, ...}) = 0 <0.13>
> 14:47:35.040635 open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 4 <0.15>
> 14:47:35.040678 fstat(4, {st_mode=S_IFREG|0644, st_size=183, ...}) = 0
> <0.10>
> 14:47:35.040718 read(4, "127.0.0.1\tlocalhost\n127.0.1.1\tha"..., 4096) =
> 183 <0.11>
> 14:47:35.040760 read(4, "", 4096)   = 0 <0.10>
> 14:47:35.040795 close(4)= 0 <0.11>
> 14:47:35.040831 stat("/etc/resolv.conf", {st_mode=S_IFREG|0644,
> st_size=198, ...}) = 0 <0.13>
> 14:47:35.040876 socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 4
> <0.14>
> 14:47:35.040914 connect(4, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("213.80.98.2")}, 16) = 0 <0.14>
> 14:47:35.040955 poll([{fd=4, events=POLLOUT}], 1, 0) = 1 ([{fd=4,
> revents=POLLOUT}]) <0.11>
> 14:47:35.041015 sendto(4,
> "o\5\1\0\0\1\0\0\0\0\0\0\4jaws\6loop54\2se\0\0\1\0\1", 32, MSG_NOSIGNAL,
> NULL, 0) = 32 <0.23>
> 14:47:35.041067 poll([{fd=4, events=POLLIN}], 1, 5000) = 0 (Timeout)
> <5.005048>
> 
> It seems to be doing this for each of our backends, but in some cases it
> gets a timeout which stalls the process for 5 seconds. On our other servers
> it seems to run the same calls but it doesn't timeout as often so the total
> time is much faster.
> 
> So some follow up questions:
> 
> 1. Does it try to connect to each backend when starting up and won't start
> until it has tested all of them? If so, can we disable this?

Apparently, it tries to do DNS resolutions, using 213.80.98.2 as a DNS
server, but that server doesn't seem to always answer; in this case it won't
answer for jaws.loop54.se, hence the timeout.
If it only happens on one server, maybe you have network issues, and it has
a hard time reaching the DNS server ?

Regards,

Olivier



[PATCH] Properly handle weight increase with consistent weight

2017-10-17 Thread Olivier Houchard
Hi,

When consistent hashing is used and the weight of a server is increased
beyond its original value, the change currently has no effect (in fact it
will reset the weight to the original one, if it had been decreased). To fix
this, the attached patch allocates more nodes and adds them to the ebtree as
needed.
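
As a hypothetical illustration of the case being fixed (backend name, server
and socket path made up):

# backend using consistent hashing
backend be_app
    balance uri
    hash-type consistent
    server s1 192.0.2.10:80 weight 10

# raise the weight beyond the configured 10; before this patch the extra
# weight had no effect on the hash ring
$ echo "set weight be_app/s1 24" | socat stdio UNIX-CONNECT:/run/haproxy/admin.sock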

Regards,

Olivier
>From a8d290e08d4820fe5058ba00fd4ef762e562cb69 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Tue, 17 Oct 2017 15:52:59 +0200
Subject: [PATCH] MINOR: server: Handle weight increase in consistent hash.

When the server weight is raised using the CLI, extra nodes have to be
allocated, or the weight will effectively stay the same as the original one.
---
 src/lb_chash.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/src/lb_chash.c b/src/lb_chash.c
index 82124bc27..f368b684e 100644
--- a/src/lb_chash.c
+++ b/src/lb_chash.c
@@ -78,6 +78,25 @@ static inline void chash_queue_dequeue_srv(struct server *s)
eb32_delete(&s->lb_nodes[s->lb_nodes_now].node);
}
 
+   /* Attempt to increase the total number of nodes, if the user
+* increased the weight beyond the original weight
+*/
+   if (s->lb_nodes_tot < s->next_eweight) {
+   struct tree_occ *new_nodes = realloc(s->lb_nodes, 
s->next_eweight);
+
+   if (new_nodes) {
+   unsigned int j;
+
+   s->lb_nodes = new_nodes;
+   memset(&s->lb_nodes[s->lb_nodes_tot], 0,
+   (s->next_eweight - s->lb_nodes_tot) * 
sizeof(*s->lb_nodes));
+   for (j = s->lb_nodes_tot; j < s->next_eweight; j++) {
+   s->lb_nodes[j].server = s;
+   s->lb_nodes[j].node.key = full_hash(s->puid * 
SRV_EWGHT_RANGE + j);
+   }
+   s->lb_nodes_tot = s->next_eweight;
+   }
+   }
while (s->lb_nodes_now < s->next_eweight) {
if (s->lb_nodes_now >= s->lb_nodes_tot) // should always be 
false anyway
break;
-- 
2.13.5



[PATCH] checks: Add a keyword to specify the SNI in health checks

2017-10-17 Thread Olivier Houchard
Hi,

The attached patch adds a new keyword to servers, "check-sni", that lets you
specify which SNI to use when doing health checks over SSL.

Regards,

Olivier
>From 24779f0985041f4e680855d453a4bc5d096756f9 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Tue, 17 Oct 2017 17:33:43 +0200
Subject: [PATCH] MINOR: checks: Add a new keyword to specify a SNI when doing
 SSL checks.

Add a new keyword, "check-sni", to be able to specify the SNI to be used when
doing health checks over SSL.
---
 doc/configuration.txt  |  4 
 include/types/checks.h |  1 +
 src/checks.c   |  8 
 src/ssl_sock.c | 19 +++
 4 files changed, 32 insertions(+)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 934f87759..1421808b8 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -10970,6 +10970,10 @@ check-send-proxy
   "check-send-proxy" option needs to be used to force the use of the
   protocol. See also the "send-proxy" option for more information.
 
+check-sni
+  This option allows you to specify the SNI to be used when doing health checks
+  over SSL.
+
 check-ssl
   This option forces encryption of all health checks over SSL, regardless of
   whether the server uses SSL or not for the normal traffic. This is generally
diff --git a/include/types/checks.h b/include/types/checks.h
index 283ff3dbe..3559f2d52 100644
--- a/include/types/checks.h
+++ b/include/types/checks.h
@@ -184,6 +184,7 @@ struct check {
char **envp;/* the environment to use if 
running a process-based check */
struct pid_list *curpid;/* entry in pid_list used for 
current process-based test, or -1 if not in test */
struct sockaddr_storage addr;   /* the address to check */
+   char *sni;  /* Server name */
 };
 
 struct check_status {
diff --git a/src/checks.c b/src/checks.c
index c02935cf0..413365b24 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -60,6 +60,10 @@
 #include 
 #include 
 
+#ifdef USE_OPENSSL
+#include 
+#endif /* USE_OPENSSL */
+
 static int httpchk_expect(struct server *s, int done);
 static int tcpcheck_get_step_id(struct check *);
 static char * tcpcheck_get_step_comment(struct check *, int);
@@ -1597,6 +1601,10 @@ static int connect_conn_chk(struct task *t)
ret = SF_ERR_INTERNAL;
if (proto && proto->connect)
ret = proto->connect(conn, check->type, quickack ? 2 : 0);
+#ifdef USE_OPENSSL
+   if (s->check.sni)
+   ssl_sock_set_servername(conn, s->check.sni);
+#endif
if (s->check.send_proxy && !(check->state & CHK_ST_AGENT)) {
conn->send_proxy_ofs = 1;
conn->flags |= CO_FL_SEND_PROXY;
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 774a5a683..1d00b42e4 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -7075,6 +7075,24 @@ static int srv_parse_ca_file(char **args, int *cur_arg, 
struct proxy *px, struct
return 0;
 }
 
+/* parse the "check-sni" server keyword */
+static int srv_parse_check_sni(char **args, int *cur_arg, struct proxy *px, 
struct server *newsrv, char **err)
+{
+   if (!*args[*cur_arg + 1]) {
+   if (err)
+   memprintf(err, "'%s' : missing SNI", args[*cur_arg]);
+   return ERR_ALERT | ERR_FATAL;
+   }
+
+   newsrv->check.sni = strdup(args[*cur_arg + 1]);
+   if (!newsrv->check.sni) {
+   memprintf(err, "'%s' : failed to allocate memory", 
args[*cur_arg]);
+   return ERR_ALERT | ERR_FATAL;
+   }
+   return 0;
+
+}
+
 /* parse the "check-ssl" server keyword */
 static int srv_parse_check_ssl(char **args, int *cur_arg, struct proxy *px, 
struct server *newsrv, char **err)
 {
@@ -8031,6 +8049,7 @@ static struct bind_kw_list bind_kws = { "SSL", { }, {
  */
 static struct srv_kw_list srv_kws = { "SSL", { }, {
{ "ca-file", srv_parse_ca_file,1, 1 }, /* 
set CAfile to process verify server cert */
+   { "check-sni",   srv_parse_check_sni,  1, 1 }, /* 
set SNI */
{ "check-ssl",   srv_parse_check_ssl,  0, 1 }, /* 
enable SSL for health checks */
{ "ciphers", srv_parse_ciphers,1, 1 }, /* 
select the cipher suite */
{ "crl-file",srv_parse_crl_file,   1, 1 }, /* 
set certificate revocation list file use on server cert verify */
-- 
2.13.5



[PATCH] Reset a few more counters on "clear counters"

2017-10-18 Thread Olivier Houchard
Hi,

A few counters (namely, MaxSslRate, SslFrontendMaxKeyRate, and 
SslBackendMaxKeyRate) are not cleared as I think they should be, when "clear
counters" is used.
The attached patch addresses that.
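
For context, these are among the Max* values reported by "show info"; they
are reset with the usual CLI command (socket path illustrative):

$ echo "clear counters" | socat stdio UNIX-CONNECT:/run/haproxy/admin.sock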

Regards,

Olivier
>From d90baef4715024e50d9596bd1410b8ea03ae1ed9 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Tue, 17 Oct 2017 19:23:25 +0200
Subject: [PATCH] MINOR: stats: Clear a bit more counters with in
 cli_parse_clear_counters().

Clear MaxSslRate, SslFrontendMaxKeyRate and SslBackendMaxKeyRate when
clear counters is used, it was probably forgotten when those counters were
added.
---
 src/stats.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/stats.c b/src/stats.c
index 6cbda22f1..25d6e65ff 100644
--- a/src/stats.c
+++ b/src/stats.c
@@ -3579,6 +3579,9 @@ static int cli_parse_clear_counters(char **args, struct 
appctx *appctx, void *pr
 
global.cps_max = 0;
global.sps_max = 0;
+   global.ssl_max = 0;
+   global.ssl_fe_keys_max = 0;
+   global.ssl_be_keys_max = 0;
return 1;
 }
 
-- 
2.13.5



Re: [PATCH] MINOR: server: missing chunck allocation in srv_update_status()

2017-10-24 Thread Olivier Houchard
On Tue, Oct 24, 2017 at 05:37:42PM +0200, Baptiste wrote:
> Hi,
> 
> While testing Christopher's DNS "thread-safe" code, I found a bug in
> srv_update_status following a recent update (related to threads too).
> 
> The patch is in attachment.


Ah, you beat me to it! I ran into the exact same issue. However my patch
also releases tmptrash, so I'm attaching it anyway, for great justice.

Regards,

Olivier



[PATCH] MINOR: Fix checks when connect_conn_chk() fails srv_update_status()

2017-10-24 Thread Olivier Houchard
Hi,

When checks were modified to dynamically allocate a connection, in case
connect_conn_chk() fails, the connection was never free'd, which made the
check unable to run again.

The attached patch addresses this.

Regards,

Olivier
>From 5267b95d7f196d1893d73035abe850226601284c Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Tue, 24 Oct 2017 19:03:30 +0200
Subject: [PATCH 2/2] BUG/MINOR: checks: Don't forget to release the connection
 on error case.

When switching the check code to a non-permanent connection, the new code
forgot to free the connection if an error happened and was returned by
connect_conn_chk(), leading to the check never being run again.
---
 src/checks.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/checks.c b/src/checks.c
index e6cc42a1c..880b47bc1 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -2118,6 +2118,10 @@ static struct task *process_chk_conn(struct task *t)
}
 
/* here, we have seen a synchronous error, no fd was allocated 
*/
+   if (conn) {
+   conn_free(conn);
+   check->conn = conn = NULL;
+   }
 
check->state &= ~CHK_ST_INPROGRESS;
check_notify_failure(check);
-- 
2.13.5



Re: [PATCH] MINOR: server: missing chunck allocation in srv_update_status()

2017-10-24 Thread Olivier Houchard
On Tue, Oct 24, 2017 at 07:12:15PM +0200, Olivier Houchard wrote:
> On Tue, Oct 24, 2017 at 05:37:42PM +0200, Baptiste wrote:
> > Hi,
> > 
> > While testing Christopher's DNS "thread-safe" code, I found a bug in
> > srv_update_status following a recent update (related to threads too).
> > 
> > The patch is in attachment.
> 
> 
> Ah, you beat me to it! I ran into the exact same issue. However my patch
> also releases tmptrash, so I'm attaching it anyway, for great justice.
> 
> Regards,
> 

Of course I forgot to attach the patch, here it comes.

Olivier
>From c6911d42df6f1938beb30f37919327ca4fc31b02 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Tue, 24 Oct 2017 17:42:47 +0200
Subject: [PATCH 1/2] BUG/MINOR: server: Allocate tmptrash before using it.

Don't forget to allocate tmptrash before using it, and free it once we're
done.
---
 src/server.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/src/server.c b/src/server.c
index f57434a29..844d6963b 100644
--- a/src/server.c
+++ b/src/server.c
@@ -4655,16 +4655,21 @@ void srv_update_status(struct server *s)
}
 
if (s->cur_state == SRV_ST_STOPPED) {   /* server was already 
down */
-   chunk_printf(tmptrash,
-"%sServer %s/%s was DOWN and now enters 
maintenance%s%s%s",
-s->flags & SRV_F_BACKUP ? "Backup " : "", 
s->proxy->id, s->id,
-*(s->adm_st_chg_cause) ? " (" : "", 
s->adm_st_chg_cause, *(s->adm_st_chg_cause) ? ")" : "");
+   tmptrash = alloc_trash_chunk();
+   if (tmptrash) {
+   chunk_printf(tmptrash,
+   "%sServer %s/%s was DOWN and now enters 
maintenance%s%s%s",
+   s->flags & SRV_F_BACKUP ? "Backup " : "", 
s->proxy->id, s->id,
+   *(s->adm_st_chg_cause) ? " (" : "", 
s->adm_st_chg_cause, *(s->adm_st_chg_cause) ? ")" : "");
 
-   srv_append_status(tmptrash, s, NULL, -1, (s->next_admin 
& SRV_ADMF_FMAINT));
+   srv_append_status(tmptrash, s, NULL, -1, 
(s->next_admin & SRV_ADMF_FMAINT));
 
-   if (!(global.mode & MODE_STARTING)) {
-   Warning("%s.\n", tmptrash->str);
-   send_log(s->proxy, LOG_NOTICE, "%s.\n", 
tmptrash->str);
+   if (!(global.mode & MODE_STARTING)) {
+   Warning("%s.\n", tmptrash->str);
+   send_log(s->proxy, LOG_NOTICE, "%s.\n", 
tmptrash->str);
+   }
+   free_trash_chunk(tmptrash);
+   tmptrash = NULL;
}
}
else {  /* server was still running */
-- 
2.13.5



Re: [PATCH] support Openssl 1.1.1 early callback API for HS

2017-10-25 Thread Olivier Houchard
Hi Emmanuel,

On Wed, Oct 25, 2017 at 02:37:58PM +0200, Emmanuel Hocdet wrote:
> Hi,
> 
> . patch series rebased from master 
> . update openssl 1.1.1 api calls with new early callback name
> (https://www.openssl.org/docs/manmaster/man3/SSL_CTX_set_client_hello_cb.html 
> )
> 

That mostly looks like the version I maintained, except :
-   if (!SSL_get_servername(ssl, TLSEXT_NAMETYPE_host_name))
+   if (!SSL_get_servername(ssl, TLSEXT_NAMETYPE_host_name) || 
!s->generate_certs)

Shouldn't that be && !s->generate_certs ? Or we'll return SSL_TLSEXT_ERR_NOACK
as soon as we don't generate certificates.
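
In other words, the intended logic is presumably (sketch):

/* only give up (NOACK) when there is neither an SNI to switch on
 * nor certificate generation to fall back to */
if (!SSL_get_servername(ssl, TLSEXT_NAMETYPE_host_name) && !s->generate_certs)
    return SSL_TLSEXT_ERR_NOACK;
return SSL_TLSEXT_ERR_OK;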

Regards,

Olivier



Re: [PATCHES][ssl] Add 0-RTT support with OpenSSL 1.1.1

2017-10-26 Thread Olivier Houchard
et_addr_len(&conn->addr.to));
+   ssl_ctx = ssl_sock_get_generated_cert(key, bind_conf);
+   if (ssl_ctx) {
+   /* switch ctx */
+   SSL_set_SSL_CTX(ssl, ssl_ctx);
+   return 1;
+   }
+   }
+   return 0;
 }
 #endif /* !defined SSL_NO_GENERATE_CERTIFICATES */
 
@@ -1955,12 +1976,12 @@ static void ssl_sock_switchctx_set(SSL *ssl, SSL_CTX 
*ctx)
 
 static int ssl_sock_switchctx_err_cbk(SSL *ssl, int *al, void *priv)
 {
+   struct bind_conf *s = priv;
(void)al; /* shut gcc stupid warning */
-   (void)priv;
 
-   if (!SSL_get_servername(ssl, TLSEXT_NAMETYPE_host_name))
-   return SSL_TLSEXT_ERR_NOACK;
-   return SSL_TLSEXT_ERR_OK;
+   if (SSL_get_servername(ssl, TLSEXT_NAMETYPE_host_name) || 
s->generate_certs)
+   return SSL_TLSEXT_ERR_OK;
+   return SSL_TLSEXT_ERR_NOACK;
 }
 
 #ifdef OPENSSL_IS_BORINGSSL
@@ -2022,6 +2043,11 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, 
void *arg)
servername = extension_data;
servername_len = len;
} else {
+#if (!defined SSL_NO_GENERATE_CERTIFICATES)
+   if (s->generate_certs && 
ssl_sock_generate_certificate_from_conn(s, ssl)) {
+   return 1;
+   }
+#endif
/* without SNI extension, is the default_ctx (need 
SSL_TLSEXT_ERR_NOACK) */
if (!s->strict_sni) {
ssl_sock_switchctx_set(ssl, s->default_ctx);
@@ -2165,6 +2191,12 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, 
void *arg)
methodVersions[conf->ssl_methods.max].ssl_set_version(ssl, 
SET_MAX);
return 1;
}
+#if (!defined SSL_NO_GENERATE_CERTIFICATES)
+   if (s->generate_certs && ssl_sock_generate_certificate(trash.str, s, 
ssl)) {
+   /* switch ctx done in ssl_sock_generate_certificate */
+   return 1;
+   }
+#endif
if (!s->strict_sni) {
/* no certificate match, is the default_ctx */
ssl_sock_switchctx_set(ssl, s->default_ctx);
@@ -2199,22 +2231,8 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, 
void *priv)
servername = SSL_get_servername(ssl, TLSEXT_NAMETYPE_host_name);
if (!servername) {
 #if (!defined SSL_NO_GENERATE_CERTIFICATES)
-   if (s->generate_certs) {
-   struct connection *conn = SSL_get_app_data(ssl);
-   unsigned int key;
-   SSL_CTX *ctx;
-
-   conn_get_to_addr(conn);
-   if (conn->flags & CO_FL_ADDR_TO_SET) {
-   key = 
ssl_sock_generated_cert_key(&conn->addr.to, get_addr_len(&conn->addr.to));
-   ctx = ssl_sock_get_generated_cert(key, s);
-   if (ctx) {
-   /* switch ctx */
-   SSL_set_SSL_CTX(ssl, ctx);
-   return SSL_TLSEXT_ERR_OK;
-   }
-   }
-   }
+   if (s->generate_certs && 
ssl_sock_generate_certificate_from_conn(s, ssl))
+   return SSL_TLSEXT_ERR_OK;
 #endif
if (s->strict_sni)
return SSL_TLSEXT_ERR_ALERT_FATAL;
@@ -2247,10 +2265,8 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, 
void *priv)
}
if (!node || container_of(node, struct sni_ctx, name)->neg) {
 #if (!defined SSL_NO_GENERATE_CERTIFICATES)
-   SSL_CTX *ctx;
-   if (s->generate_certs &&
-   (ctx = ssl_sock_generate_certificate(servername, s, ssl))) {
-   /* switch ctx */
+   if (s->generate_certs && 
ssl_sock_generate_certificate(servername, s, ssl)) {
+   /* switch ctx done in ssl_sock_generate_certificate */
return SSL_TLSEXT_ERR_OK;
}
 #endif
@@ -3678,8 +3694,8 @@ ssl_sock_initial_ctx(struct bind_conf *bind_conf)
SSL_CTX_set_tlsext_servername_callback(ctx, ssl_sock_switchctx_err_cbk);
 #else
SSL_CTX_set_tlsext_servername_callback(ctx, ssl_sock_switchctx_cbk);
-   SSL_CTX_set_tlsext_servername_arg(ctx, bind_conf);
 #endif
+   SSL_CTX_set_tlsext_servername_arg(ctx, bind_conf);
 #endif
return cfgerr;
 }
-- 
2.13.5

>From 7f8d3065d86a5177c3f664d4a2ed34e62ff60bb3 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Fri, 22 Sep 2017 18:26:28 +0200
Subject: [PATCH 4/6] ssl: Handle early data with OpenSSL 1.1.1

When compiled with Openssl >= 1.1.1, before attempting to do the handshake,
try to read any early data. If any early data is present, then we'll create
the session

Re: [PATCHES][ssl] Add 0-RTT support with OpenSSL 1.1.1

2017-10-27 Thread Olivier Houchard
On Fri, Oct 27, 2017 at 11:22:15AM +0200, Emmanuel Hocdet wrote:
> Hi Olivier
> 
> > Le 27 oct. 2017 à 01:08, Olivier Houchard  a écrit :
> > 
> > Hi,
> > 
> > You'll find attached updated patches, rebased on the latest master, and on
> > top of Emmanuel's latest patches (also attached for reference).
> > This version allows to enable 0RTT per SNI.
> > It unfortunately still can't send early data to servers, this may or may
> > not happen later.
> 
> why add BC_SSL_O_EARLY_DATA?
> the information could be set in  conf->default_ssl_conf->early_data like
> in the same manner as per certificat configuration.
> 

Hi Emmanuel,

Because in my experience, we don't always have a default_ssl_conf. We only
have one if a crt_list file is provided.

Regards,

Olivier



Re: [PATCHES][ssl] Add 0-RTT support with OpenSSL 1.1.1

2017-10-27 Thread Olivier Houchard
On Fri, Oct 27, 2017 at 12:36:31PM +0200, Emmanuel Hocdet wrote:
> 
> > Le 27 oct. 2017 à 11:22, Emmanuel Hocdet  a écrit :
> > 
> > Hi Olivier
> > 
> >> Le 27 oct. 2017 à 01:08, Olivier Houchard  a écrit :
> >> 
> >> Hi,
> >> 
> >> You'll find attached updated patches, rebased on the latest master, and on
> >> top of Emmanuel's latest patches (also attached for reference).
> >> This version allows to enable 0RTT per SNI.
> >> It unfortunately still can't send early data to servers, this may or may
> >> not happen later.
> > 
> > why add BC_SSL_O_EARLY_DATA?
> > the information could be set in  conf->default_ssl_conf->early_data like
> > in the same manner as per certificat configuration.
> 
> okay, it's bind_conf->ssl_conf.early_data (my quick patch was not the good
> one)
> 
> You add allow-0rtt in the global ssl_options, which originally is only for
> ssl options (api), and I'm not sure
> it's really necessary for this special feature.
> 

That indeed would do the trick, I'll change that.

Regards,

Olivier



Re: [PATCHES][ssl] Add 0-RTT support with OpenSSL 1.1.1

2017-10-27 Thread Olivier Houchard
Hi,

On Fri, Oct 27, 2017 at 12:45:36PM +0200, Olivier Houchard wrote:
> On Fri, Oct 27, 2017 at 12:36:31PM +0200, Emmanuel Hocdet wrote:
> > 
> > > > Le 27 oct. 2017 à 11:22, Emmanuel Hocdet  a écrit :
> > > 
> > > Hi Olivier
> > > 
> > >> Le 27 oct. 2017 à 01:08, Olivier Houchard  a écrit :
> > >> 
> > >> Hi,
> > >> 
> > >> You'll find attached updated patches, rebased on the latest master, and 
> > >> on
> > >> top of Emmanuel's latest patches (also attached for reference).
> > >> This version allows to enable 0RTT per SNI.
> > >> It unfortunately still can't send early data to servers, this may or may
> > >> not happen later.
> > > 
> > > why add BC_SSL_O_EARLY_DATA?
> > > the information could be set in  conf->default_ssl_conf->early_data like
> > > in the same manner as per certificat configuration.
> > 
> > okay, it's bind_conf->ssl_conf.early_data (my quick patch was not the
> > good one)
> > 
> > You add allow-0rtt in the global ssl_options, which originally is only for
> > ssl options (api), and I'm not sure
> > it's really necessary for this special feature.
> > 
> 
> That indeed would do the trick, I'll change that.
> 
> Regards,
> 
> Olivier

The attached patch does use the ssl_conf, instead of abusing ssl_options.
I also added a new field in global_ssl; I wasn't so sure about this, but
decided people may want to enable 0RTT globally.

Emmanuel, is this ok for you ?

Regards,

Olivier
>From 742a5598f1171a81b5fc58cc500382dc819fa579 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Fri, 27 Oct 2017 14:58:08 +0200
Subject: [PATCH] MINOR: ssl: Don't abuse ssl_options.

A bind_conf does contain a ssl_bind_conf, which already has a flag to know
if early data are activated, so use that, instead of adding a new flag in
the ssl_options field.
---
 doc/configuration.txt|  4 
 include/types/listener.h |  1 -
 src/ssl_sock.c   | 26 ++
 3 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index cf0d69606..b63ceb408 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -847,6 +847,10 @@ resetenv [ ...]
   next line in the configuration file sees the new environment. See also
   "setenv", "presetenv", and "unsetenv".
 
+ssl-allow-0rtt
+  Allow using 0RTT on every listener. 0RTT is prone to various attacks, so be
+  sure to know the security implications before activating it.
+
 stats bind-process [ all | odd | even | [-] ] ...
   Limits the stats socket to a certain set of processes numbers. By default the
   stats socket is bound to all processes, causing a warning to be emitted when
diff --git a/include/types/listener.h b/include/types/listener.h
index 19d1dbe3b..bcebea810 100644
--- a/include/types/listener.h
+++ b/include/types/listener.h
@@ -105,7 +105,6 @@ enum li_state {
 #define BC_SSL_O_NONE   0x
 #define BC_SSL_O_NO_TLS_TICKETS 0x0100 /* disable session resumption tickets */
 #define BC_SSL_O_PREF_CLIE_CIPH 0x0200  /* prefer client ciphers */
-#define BC_SSL_O_EARLY_DATA 0x0400  /* Accept early data */
 #endif
 
 /* ssl "bind" settings */
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 844be0a0e..62fcd00bd 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -165,6 +165,7 @@ static struct {
char *crt_base; /* base directory path for certificates */
char *ca_base;  /* base directory path for CAs and CRLs */
int  async; /* whether we use ssl async mode */
+   int default_early_data; /* Shall we default to allow early data */
 
char *listen_default_ciphers;
char *connect_default_ciphers;
@@ -2009,7 +2010,7 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, void 
*arg)
conn = SSL_get_app_data(ssl);
s = objt_listener(conn->target)->bind_conf;
 
-   if (s->ssl_options & BC_SSL_O_EARLY_DATA)
+   if (s->ssl_conf.early_data)
allow_early = 1;
 #ifdef OPENSSL_IS_BORINGSSL
if (SSL_early_callback_ctx_extension_get(ctx, TLSEXT_TYPE_server_name,
@@ -6976,7 +6977,7 @@ static int ssl_bind_parse_allow_0rtt(char **args, int 
cur_arg, struct proxy *px,
 
 static int bind_parse_allow_0rtt(char **args, int cur_arg, struct proxy *px, 
struct bind_conf *conf, char **err)
 {
-   conf->ssl_options |= BC_SSL_O_EARLY_DATA;
+   conf->ssl_conf.early_data = 1;
return 0;
 }
 
@@ -7102,6 +7103,7 @@ static int bind_parse_ssl(char **args, int cur_arg, 
struct proxy *px, struct bin
conf->ssl_conf.ciphers = 
strdup(global_ssl.listen_default_c

Re: [PATCHES][ssl] Add 0-RTT support with OpenSSL 1.1.1

2017-10-31 Thread Olivier Houchard
On Fri, Oct 27, 2017 at 03:54:27PM +0200, Emmanuel Hocdet wrote:
> 
> > Le 27 oct. 2017 à 15:02, Olivier Houchard  a écrit :
> > 
> > The attached patch does use the ssl_conf, instead of abusing ssl_options.
> > I also added a new field in global_ssl, I wasn't so sure about this, but
> > decided people may want to enable 0RTT globally.
> > 
> > Emmanuel, is this ok for you ?
> > 
> 
> A global option seems a bad idea.
> 
> My opinion about global ssl ‘options’ for bind.
> . A good fit is in ssl-default-bind-options. It can be extended to more
> options, like generate-cert, strict-sni, ...
> (In this case, having a kw_list would be a good idea, to have something
> better than if/then/else parsing in ssl_parse_default_bind_options.)
> . Some options already have 2 locations for configuration (bind line and per
> certificate); do we really need a third? And some options are not really
> good candidates.
> 
> ++
> Manu
> 

Hi,

The attached patch removes the global ssl-allow-0rtt option.

Regards,

Olivier
>From 119a9c1b5324c4ef0636bc35d8e431a17c287076 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Tue, 31 Oct 2017 13:32:10 +0100
Subject: [PATCH] MINOR: ssl: Remove the global allow-0rtt option.

---
 doc/configuration.txt |  4 
 src/ssl_sock.c| 20 
 2 files changed, 24 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 8d0624839..67b888905 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -847,10 +847,6 @@ resetenv [ ...]
   next line in the configuration file sees the new environment. See also
   "setenv", "presetenv", and "unsetenv".
 
-ssl-allow-0rtt
-  Allow using 0RTT on every listener. 0RTT is prone to various attacks, so be
-  sure to know the security implications before activating it.
-
 stats bind-process [ all | odd | even | [-] ] ...
   Limits the stats socket to a certain set of processes numbers. By default the
   stats socket is bound to all processes, causing a warning to be emitted when
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 13d952652..7f52c4057 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -166,7 +166,6 @@ static struct {
char *crt_base; /* base directory path for certificates */
char *ca_base;  /* base directory path for CAs and CRLs */
int  async; /* whether we use ssl async mode */
-   int default_early_data; /* Shall we default to allow early data */
 
char *listen_default_ciphers;
char *connect_default_ciphers;
@@ -7325,7 +7324,6 @@ static int bind_parse_ssl(char **args, int cur_arg, 
struct proxy *px, struct bin
conf->ssl_conf.ciphers = 
strdup(global_ssl.listen_default_ciphers);
conf->ssl_options |= global_ssl.listen_default_ssloptions;
conf->ssl_conf.ssl_methods.flags |= 
global_ssl.listen_default_sslmethods.flags;
-   conf->ssl_conf.early_data = global_ssl.default_early_data;
if (!conf->ssl_conf.ssl_methods.min)
conf->ssl_conf.ssl_methods.min = 
global_ssl.listen_default_sslmethods.min;
if (!conf->ssl_conf.ssl_methods.max)
@@ -7819,23 +7817,6 @@ static int ssl_parse_global_ca_crt_base(char **args, int 
section_type, struct pr
return 0;
 }
 
-/* parse the "ssl-allow-0rtt" keyword in global section.
- * Returns <0 on alert, >0 on warning, 0 on success.
- */
-static int ssl_parse_global_ssl_allow_0rtt(char **args, int section_type,
-struct proxy *curpx, struct proxy *defpx, const char *file, int line,
-char **err)
-{
-#if (OPENSSL_VERSION_NUMBER >= 0x10101000L)
-global_ssl.default_early_data = 1;
-return 0;
-#else
-memprintf(err, "'%s': openssl library does not early data", args[0]);
-return -1;
-#endif
-
-}
-
 /* parse the "ssl-mode-async" keyword in global section.
  * Returns <0 on alert, >0 on warning, 0 on success.
  */
@@ -8526,7 +8507,6 @@ static struct cfg_kw_list cfg_kws = {ILH, {
{ CFG_GLOBAL, "ca-base",  ssl_parse_global_ca_crt_base },
{ CFG_GLOBAL, "crt-base", ssl_parse_global_ca_crt_base },
{ CFG_GLOBAL, "maxsslconn", ssl_parse_global_int },
-   { CFG_GLOBAL, "ssl-allow-0rtt", ssl_parse_global_ssl_allow_0rtt },
{ CFG_GLOBAL, "ssl-default-bind-options", 
ssl_parse_default_bind_options },
{ CFG_GLOBAL, "ssl-default-server-options", 
ssl_parse_default_server_options },
 #ifndef OPENSSL_NO_DH
-- 
2.13.5



[PATCH] Fix SRV records again

2017-10-31 Thread Olivier Houchard
Hi,

The recently merged MT code broke SRV records by attempting to recursively
lock the DNS lock. The attached patch attempts to fix this by letting the
relevant code know whether the lock is already held.
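
For illustration, here is a minimal sketch of the pattern the patch applies,
with generic names rather than the real haproxy symbols: the callee takes an
extra flag saying whether the caller already holds the lock, and only locks
and unlocks when it does not.

#include <pthread.h>

static pthread_spinlock_t dns_lock;  /* initialized elsewhere with pthread_spin_init() */

/* The extra parameter tells us whether the caller already holds dns_lock,
 * so we never try to take it recursively. */
static int set_fqdn(const char *hostname, int already_locked)
{
    int ret = 0;

    if (!already_locked)
        pthread_spin_lock(&dns_lock);

    /* ... update the resolution state for hostname under the lock ... */
    (void)hostname;

    if (!already_locked)
        pthread_spin_unlock(&dns_lock);
    return ret;
}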

Regards,

Olivier
>From 4df4df3bc00ddf9a1e7b94188058a0d76816 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Tue, 31 Oct 2017 15:21:19 +0100
Subject: [PATCH] BUG/MINOR: dns: Fix SRV records with the new thread code.

srv_set_fqdn() may be called with the DNS lock already held, but tries to
lock it anyway. So, add a new parameter to let it know if it was already
locked or not.
---
 include/proto/server.h |  2 +-
 src/dns.c  |  2 +-
 src/server.c   | 21 -
 3 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/include/proto/server.h b/include/proto/server.h
index ff4ec77ca..14f4926e2 100644
--- a/include/proto/server.h
+++ b/include/proto/server.h
@@ -56,7 +56,7 @@ extern struct list updated_servers;
 
 /* functions related to server name resolution */
 int snr_update_srv_status(struct server *s, int has_no_ip);
-const char *update_server_fqdn(struct server *server, const char *fqdn, const 
char *updater);
+const char *update_server_fqdn(struct server *server, const char *fqdn, const 
char *updater, int dns_locked);
 int snr_resolution_cb(struct dns_requester *requester, struct dns_nameserver 
*nameserver);
 int snr_resolution_error_cb(struct dns_requester *requester, int error_code);
 struct server *snr_check_ip_callback(struct server *srv, void *ip, unsigned 
char *ip_family);
diff --git a/src/dns.c b/src/dns.c
index 1fdc461e6..a8d468cf4 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -545,7 +545,7 @@ static void dns_check_dns_response(struct dns_resolution 
*res)
if (dns_dn_label_to_str(item->target, 
item->data_len+1,
hostname, 
DNS_MAX_NAME_SIZE) == -1)
continue;
-   msg = update_server_fqdn(srv, hostname, "SRV 
record");
+   msg = update_server_fqdn(srv, hostname, "SRV 
record", 1);
if (msg)
send_log(srv->proxy, LOG_NOTICE, "%s", 
msg);
 
diff --git a/src/server.c b/src/server.c
index 2d0e3b4f9..37f90d8c9 100644
--- a/src/server.c
+++ b/src/server.c
@@ -53,7 +53,7 @@ HA_SPINLOCK_T updated_servers_lock;
 static void srv_register_update(struct server *srv);
 static void srv_update_state(struct server *srv, int version, char **params);
 static int srv_apply_lastaddr(struct server *srv, int *err_code);
-static int srv_set_fqdn(struct server *srv, const char *fqdn);
+static int srv_set_fqdn(struct server *srv, const char *fqdn, int dns_locked);
 
 /* List head of all known server keywords */
 static struct srv_kw_list srv_keywords = {
@@ -2911,7 +2911,7 @@ static void srv_update_state(struct server *srv, int 
version, char **params)
 * from stats socket).
 */
if (fqdn_set_by_cli) {
-   srv_set_fqdn(srv, fqdn);
+   srv_set_fqdn(srv, fqdn, 0);
srv->next_admin |= 
SRV_ADMF_HMAINT;
}
}
@@ -3764,13 +3764,14 @@ int srv_set_addr_via_libc(struct server *srv, int 
*err_code)
 /* Set the server's FDQN (->hostname) from .
  * Returns -1 if failed, 0 if not.
  */
-int srv_set_fqdn(struct server *srv, const char *hostname)
+int srv_set_fqdn(struct server *srv, const char *hostname, int dns_locked)
 {
struct dns_resolution *resolution;
char  *hostname_dn;
inthostname_len, hostname_dn_len;
 
-   SPIN_LOCK(DNS_LOCK, &srv->resolvers->lock);
+   if (!dns_locked)
+   SPIN_LOCK(DNS_LOCK, &srv->resolvers->lock);
/* run time DNS resolution was not active for this server
 * and we can't enable it at run time for now.
 */
@@ -3805,11 +3806,13 @@ int srv_set_fqdn(struct server *srv, const char 
*hostname)
goto err;
 
   end:
-   SPIN_UNLOCK(DNS_LOCK, &srv->resolvers->lock);
+   if (!dns_locked)
+   SPIN_UNLOCK(DNS_LOCK, &srv->resolvers->lock);
return 0;
 
   err:
-   SPIN_UNLOCK(DNS_LOCK, &srv->resolvers->lock);
+   if (!dns_locked)
+   SPIN_UNLOCK(DNS_LOCK, &srv->resolvers->lock);
return -1;
 }
 
@@ -3936,7 +3939,7 @@ int srv_init_addr(void)
return return_code;
 }
 
-const char *update_server_fqdn(struct server *server, const char *fqdn, const 
char *updater)
+const char *update_server_fqdn(struct serv

[PATCHES] TLS 1.3 session resumption and early data to servers

2017-11-03 Thread Olivier Houchard
Hi,

The attached patches make TLS 1.3 session resumption work (it is a bit
different from the previous version, as the session is created after the
handshake), and enable sending early data to the server, as long as the
client itself used early data. We can't afford to send early data without
knowing whether the client can handle a 425 Too Early response, since that
is the only thing we can return if the server rejects the early data: we no
longer have the data, so we cannot resend them.
Any feedback is welcome, as this has mostly been tested with OpenSSL 1.1.1 only.
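
As a rough illustration of the resumption mechanism (a minimal client-side
sketch against the plain OpenSSL 1.1.1 API; the actual patch stores the
session per thread in the struct server):

#include <openssl/ssl.h>

static SSL_SESSION *cached_sess;   /* per thread in the real code */

/* With TLS 1.3 this callback fires after the handshake, once the session
 * ticket actually arrives; returning 1 tells OpenSSL we keep the reference. */
static int new_sess_cb(SSL *ssl, SSL_SESSION *sess)
{
    (void)ssl;
    if (cached_sess)
        SSL_SESSION_free(cached_sess);
    cached_sess = sess;
    return 1;
}

static SSL_CTX *make_client_ctx(void)
{
    SSL_CTX *ctx = SSL_CTX_new(TLS_client_method());

    if (!ctx)
        return NULL;
    /* Enable the client-side cache but bypass OpenSSL's internal store:
     * we only want the "new session" callback. */
    SSL_CTX_set_session_cache_mode(ctx, SSL_SESS_CACHE_CLIENT |
                                        SSL_SESS_CACHE_NO_INTERNAL_STORE);
    SSL_CTX_sess_set_new_cb(ctx, new_sess_cb);
    return ctx;
}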

Regards,

Olivier
>From 7db328b4e5028a80c9817049108f5625513a87e8 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Thu, 2 Nov 2017 19:04:38 +0100
Subject: [PATCH 1/4] BUG/MINOR: ssl: Don't assume we have a ssl_bind_conf
 because a SNI is matched.

We only have a ssl_bind_conf if crt-list is used, however we can still
match a certificate SNI, so don't assume we have a ssl_bind_conf.
---
 src/ssl_sock.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index b1d39dbbd..66930d220 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -2267,11 +2267,13 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, 
void *arg)
/* switch ctx */
struct ssl_bind_conf *conf = container_of(node, struct sni_ctx, 
name)->conf;
ssl_sock_switchctx_set(ssl, container_of(node, struct sni_ctx, 
name)->ctx);
-   methodVersions[conf->ssl_methods.min].ssl_set_version(ssl, 
SET_MIN);
-   methodVersions[conf->ssl_methods.max].ssl_set_version(ssl, 
SET_MAX);
-   if (conf->early_data)
-   allow_early = 1;
-   goto allow_early;
+   if (conf) {
+   
methodVersions[conf->ssl_methods.min].ssl_set_version(ssl, SET_MIN);
+   
methodVersions[conf->ssl_methods.max].ssl_set_version(ssl, SET_MAX);
+   if (conf->early_data)
+   allow_early = 1;
+   }
+   goto allow_early;
}
 #if (!defined SSL_NO_GENERATE_CERTIFICATES)
if (s->generate_certs && ssl_sock_generate_certificate(trash.str, s, 
ssl)) {
-- 
2.13.0

>From c2728dd7ee527c1eb99588b17eb4f917d622fef1 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Fri, 3 Nov 2017 13:43:35 +0100
Subject: [PATCH 2/4] MINOR: ssl: Handle session resumption with TLS 1.3

With TLS 1.3, sessions aren't established until after the main handshake
has completed. So we can't just rely on calling SSL_get1_session(). Instead,
we now register a callback for the "new session" event. This should work for
previous versions of TLS as well.
---
 src/ssl_sock.c | 30 --
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 66930d220..1dd3a4488 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -3852,6 +3852,23 @@ static int sh_ssl_sess_store(unsigned char *s_id, 
unsigned char *data, int data_
return 1;
 }
 
+/* SSL callback used when a new session is created while connecting to a 
server */
+static int ssl_sess_new_srv_cb(SSL *ssl, SSL_SESSION *sess)
+{
+   struct connection *conn = SSL_get_app_data(ssl);
+
+   /* check if session was reused, if not store current session on server 
for reuse */
+   if (objt_server(conn->target)->ssl_ctx.reused_sess[tid]) {
+   
SSL_SESSION_free(objt_server(conn->target)->ssl_ctx.reused_sess[tid]);
+   objt_server(conn->target)->ssl_ctx.reused_sess[tid] = NULL;
+   }
+
+   if (!(objt_server(conn->target)->ssl_ctx.options & SRV_SSL_O_NO_REUSE))
+   objt_server(conn->target)->ssl_ctx.reused_sess[tid] = 
SSL_get1_session(conn->xprt_ctx);
+
+   return 1;
+}
+
 /* SSL callback used on new session creation */
 int sh_ssl_sess_new_cb(SSL *ssl, SSL_SESSION *sess)
 {
@@ -4580,7 +4597,9 @@ int ssl_sock_prepare_srv_ctx(struct server *srv)
 #endif
}
 
-   SSL_CTX_set_session_cache_mode(srv->ssl_ctx.ctx, SSL_SESS_CACHE_OFF);
+   SSL_CTX_set_session_cache_mode(srv->ssl_ctx.ctx, SSL_SESS_CACHE_CLIENT |
+   SSL_SESS_CACHE_NO_INTERNAL_STORE);
+   SSL_CTX_sess_set_new_cb(srv->ssl_ctx.ctx, ssl_sess_new_srv_cb);
if (srv->ssl_ctx.ciphers &&
!SSL_CTX_set_cipher_list(srv->ssl_ctx.ctx, 
srv->ssl_ctx.ciphers)) {
Alert("Proxy '%s', server '%s' [%s:%d] : unable to set SSL 
cipher list to '%s'.\n",
@@ -5208,15 +5227,6 @@ reneg_ok:
update_freq_ctr(&global.ssl_be_keys_per_sec, 1);
if (global.ssl_be_keys_per_sec.curr_ctr > 
global.ssl_be_keys_max)
  

[PATCH] Fix SRV records again

2017-11-06 Thread Olivier Houchard
Hi,

The attached patch fixes a locking issue that prevented SRV records from
working.

Regards,

Olivier

>From 109dfc4075132881d4330f26d437dc8725a608dd Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Mon, 6 Nov 2017 15:15:04 +0100
Subject: [PATCH] BUG/MINOR: dns: Don't try to get the server lock if it's
 already held.

dns_link_resolution() can be called with the server lock already held, so
don't attempt to lock it again in that case.
---
 include/proto/dns.h |  2 +-
 src/dns.c   | 19 ---
 src/server.c|  2 +-
 3 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/include/proto/dns.h b/include/proto/dns.h
index c3e3846dc..3ad79c3a4 100644
--- a/include/proto/dns.h
+++ b/include/proto/dns.h
@@ -40,7 +40,7 @@ int dns_get_ip_from_response(struct dns_response_packet 
*dns_p,
  void **newip, short *newip_sin_family,
  void *owner);
 
-int dns_link_resolution(void *requester, int requester_type);
+int dns_link_resolution(void *requester, int requester_type, int 
requester_locked);
 void dns_unlink_resolution(struct dns_requester *requester);
 void dns_trigger_resolution(struct dns_requester *requester);
 
diff --git a/src/dns.c b/src/dns.c
index 6b8746090..1d12c8421 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -550,8 +550,10 @@ static void dns_check_dns_response(struct dns_resolution 
*res)
char hostname[DNS_MAX_NAME_SIZE];
 
if (dns_dn_label_to_str(item->target, 
item->data_len+1,
-   hostname, 
DNS_MAX_NAME_SIZE) == -1)
+   hostname, 
DNS_MAX_NAME_SIZE) == -1) {
+   SPIN_UNLOCK(SERVER_LOCK, &srv->lock);
continue;
+   }
msg = update_server_fqdn(srv, hostname, "SRV 
record", 1);
if (msg)
send_log(srv->proxy, LOG_NOTICE, "%s", 
msg);
@@ -1307,7 +1309,7 @@ static void dns_free_resolution(struct dns_resolution 
*resolution)
 /* Links a requester (a server or a dns_srvrq) with a resolution. It returns 0
  * on success, -1 otherwise.
  */
-int dns_link_resolution(void *requester, int requester_type)
+int dns_link_resolution(void *requester, int requester_type, int 
requester_locked)
 {
struct dns_resolution *res = NULL;
struct dns_requester  *req;
@@ -1345,10 +1347,12 @@ int dns_link_resolution(void *requester, int 
requester_type)
goto err;
 
if (srv) {
-   SPIN_LOCK(SERVER_LOCK, &srv->lock);
+   if (!requester_locked)
+   SPIN_LOCK(SERVER_LOCK, &srv->lock);
if (srv->dns_requester == NULL) {
if ((req = calloc(1, sizeof(*req))) == NULL) {
-   SPIN_UNLOCK(SERVER_LOCK, &srv->lock);
+   if (!requester_locked)
+   SPIN_UNLOCK(SERVER_LOCK, &srv->lock);
goto err;
}
req->owner = &srv->obj_type;
@@ -1356,7 +1360,8 @@ int dns_link_resolution(void *requester, int 
requester_type)
}
else
req = srv->dns_requester;
-   SPIN_UNLOCK(SERVER_LOCK, &srv->lock);
+   if (!requester_locked)
+   SPIN_UNLOCK(SERVER_LOCK, &srv->lock);
}
else if (srvrq) {
if (srvrq->dns_requester == NULL) {
@@ -1905,14 +1910,14 @@ static int dns_finalize_config(void)
 
if (srv->srvrq && !srv->srvrq->resolvers) {
srv->srvrq->resolvers = srv->resolvers;
-   if (dns_link_resolution(srv->srvrq, 
OBJ_TYPE_SRVRQ) == -1) {
+   if (dns_link_resolution(srv->srvrq, 
OBJ_TYPE_SRVRQ, 0) == -1) {
Alert("config : %s '%s' : unable to set 
DNS resolution for server '%s'.\n",
  proxy_type_str(px), px->id, 
srv->id);
err_code |= (ERR_ALERT|ERR_ABORT);
continue;
}
}
-   if (dns_link_resolution(srv, OBJ_TYPE_SERVER) == -1) {
+   if (dns_link_resolution(srv, OBJ_TYPE_SERVER, 0) == -1) 
{
Alert("config : %s '%s', unable to set DNS 
resolution for server '%s'.\n",
  

Re: [PATCH] Fix SRV records again

2017-11-06 Thread Olivier Houchard
On Mon, Nov 06, 2017 at 03:19:25PM +0100, Olivier Houchard wrote:
> Hi,
> 
> The attached patch fixes a locking issue that prevented SRV records from
> working.
> 
> Regards,
> 
> Olivier
> 


And another one, which fixes a deadlock that occurs when checks trigger DNS
resolution.

Regards,

Olivier
>From 3cedd71b5338f8689004837cdcaa0ae42e48e39c Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Mon, 6 Nov 2017 17:30:28 +0100
Subject: [PATCH] BUG/MINOR: dns: Don't lock the server lock in
 snr_check_ip_callback().

snr_check_ip_callback() may be called with the server lock held, so don't
attempt to lock it again; instead, make sure the callers always hold the lock
before calling it.
---
 src/dns.c|  6 ++
 src/server.c | 10 ++
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/dns.c b/src/dns.c
index 1d12c8421..0f93f3ce5 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -1614,7 +1614,13 @@ static void dns_resolve_recv(struct dgram_conn *dgram)
 * from the cache */
tmpns = ns;
list_for_each_entry(req, &res->requesters, list) {
+   struct server *s = objt_server(req->owner);
+
+   if (s)
+   SPIN_LOCK(SERVER_LOCK, &s->lock);
req->requester_cb(req, tmpns);
+   if (s)
+   SPIN_UNLOCK(SERVER_LOCK, &s->lock);
tmpns = NULL;
}
 
diff --git a/src/server.c b/src/server.c
index adc9fd40c..1a78fb334 100644
--- a/src/server.c
+++ b/src/server.c
@@ -3589,6 +3589,8 @@ int snr_update_srv_status(struct server *s, int has_no_ip)
  * returns:
  *  0 on error
  *  1 when no error or safe ignore
+ *
+ * Must be called with server lock held
  */
 int snr_resolution_cb(struct dns_requester *requester, struct dns_nameserver 
*nameserver)
 {
@@ -3694,7 +3696,9 @@ int snr_resolution_error_cb(struct dns_requester 
*requester, int error_code)
s = objt_server(requester->owner);
if (!s)
return 1;
+   SPIN_LOCK(SERVER_LOCK, &s->lock);
snr_update_srv_status(s, 0);
+   SPIN_UNLOCK(SERVER_LOCK, &s->lock);
return 1;
 }
 
@@ -3703,6 +3707,8 @@ int snr_resolution_error_cb(struct dns_requester 
*requester, int error_code)
  * which owns  and is up.
  * It returns a pointer to the first server found or NULL if  is not yet
  * assigned.
+ *
+ * Must be called with server lock held
  */
 struct server *snr_check_ip_callback(struct server *srv, void *ip, unsigned 
char *ip_family)
 {
@@ -3712,8 +3718,6 @@ struct server *snr_check_ip_callback(struct server *srv, 
void *ip, unsigned char
if (!srv)
return NULL;
 
-   SPIN_LOCK(SERVER_LOCK, &srv->lock);
-
be = srv->proxy;
for (tmpsrv = be->srv; tmpsrv; tmpsrv = tmpsrv->next) {
/* we found the current server is the same, ignore it */
@@ -3751,13 +3755,11 @@ struct server *snr_check_ip_callback(struct server 
*srv, void *ip, unsigned char
 (tmpsrv->addr.ss_family == AF_INET6 &&
  memcmp(ip, &((struct sockaddr_in6 
*)&tmpsrv->addr)->sin6_addr, 16) == 0))) {
SPIN_UNLOCK(SERVER_LOCK, &tmpsrv->lock);
-   SPIN_UNLOCK(SERVER_LOCK, &srv->lock);
return tmpsrv;
}
SPIN_UNLOCK(SERVER_LOCK, &tmpsrv->lock);
}
 
-   SPIN_UNLOCK(SERVER_LOCK, &srv->lock);
 
return NULL;
 }
-- 
2.13.5



[PATCHES] Fix TLS 1.3 session resumption, and 0RTT with threads.

2017-11-16 Thread Olivier Houchard
Hi,

The first patch attempts to fix session resumption with TLS 1.3, when
haproxy acts as a client, by storing the ASN1-encoded session in the struct
server instead of storing the SSL_SESSION pointer directly. Directly keeping
the SSL_SESSION doesn't seem to work well when concurrent connections are
made using the same session.
The second patch makes sure the SSL handshake is done before calling the
shutw method. Not doing so may result in errors, which ultimately lead to
the client connection being closed when it shouldn't be. This mostly
happens when more than one thread is used.
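
For illustration, a minimal sketch of the serialize/rebuild round-trip the
first patch relies on, using the standard i2d/d2i calls (error handling
trimmed, names illustrative):

#include <openssl/ssl.h>
#include <stdlib.h>

/* Serialize a session to its ASN1 form; each connection can later rebuild
 * its own private SSL_SESSION from the bytes instead of sharing a pointer. */
static unsigned char *session_to_asn1(SSL_SESSION *sess, int *len)
{
    unsigned char *buf, *p;

    *len = i2d_SSL_SESSION(sess, NULL);   /* first call: size only */
    if (*len <= 0 || (buf = malloc(*len)) == NULL)
        return NULL;
    p = buf;                              /* i2d advances the pointer */
    i2d_SSL_SESSION(sess, &p);
    return buf;
}

/* Rebuild a fresh SSL_SESSION from the stored bytes and attach it to a new
 * outgoing connection to attempt resumption. */
static int resume_from_asn1(SSL *ssl, const unsigned char *buf, int len)
{
    const unsigned char *p = buf;
    SSL_SESSION *sess = d2i_SSL_SESSION(NULL, &p, len);

    if (!sess)
        return 0;
    SSL_set_session(ssl, sess);   /* takes its own reference */
    SSL_SESSION_free(sess);       /* drop ours */
    return 1;
}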

Regards,

Olivier
>From e32a831c1cbff1fcfb66565273ec98052f3a7f79 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Thu, 16 Nov 2017 17:42:52 +0100
Subject: [PATCH 1/2] MINOR: SSL: Store the ASN1 representation of client
 sessions.

Instead of storing the SSL_SESSION pointer directly in the struct server,
store the ASN1 representation; otherwise session resumption is broken with
TLS 1.3 when multiple outgoing connections want to use the same session.
---
 include/types/server.h |  6 -
 src/ssl_sock.c | 60 --
 2 files changed, 49 insertions(+), 17 deletions(-)

diff --git a/include/types/server.h b/include/types/server.h
index 76225f7d3..92dcdbb5c 100644
--- a/include/types/server.h
+++ b/include/types/server.h
@@ -274,7 +274,11 @@ struct server {
char *sni_expr; /* Temporary variable to store a sample 
expression for SNI */
struct {
SSL_CTX *ctx;
-   SSL_SESSION **reused_sess;
+   struct {
+   unsigned char *ptr;
+   int size;
+   int allocated_size;
+   } * reused_sess;
char *ciphers;  /* cipher suite to use if 
non-null */
int options;/* ssl options */
struct tls_version_filter methods;  /* ssl methods */
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 72d9b8aee..1162aa318 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -3856,19 +3856,35 @@ static int sh_ssl_sess_store(unsigned char *s_id, 
unsigned char *data, int data_
 static int ssl_sess_new_srv_cb(SSL *ssl, SSL_SESSION *sess)
 {
struct connection *conn = SSL_get_app_data(ssl);
+   struct server *s;
 
-   /* check if session was reused, if not store current session on server 
for reuse */
-   if (objt_server(conn->target)->ssl_ctx.reused_sess[tid]) {
-   
SSL_SESSION_free(objt_server(conn->target)->ssl_ctx.reused_sess[tid]);
-   objt_server(conn->target)->ssl_ctx.reused_sess[tid] = NULL;
-   }
+   s = objt_server(conn->target);
 
-   if (!(objt_server(conn->target)->ssl_ctx.options & SRV_SSL_O_NO_REUSE))
-   objt_server(conn->target)->ssl_ctx.reused_sess[tid] = 
SSL_get1_session(conn->xprt_ctx);
+   if (!(s->ssl_ctx.options & SRV_SSL_O_NO_REUSE)) {
+   int len;
+   unsigned char *ptr;
 
-   return 1;
+   len = i2d_SSL_SESSION(sess, NULL);
+   if (s->ssl_ctx.reused_sess[tid].ptr && 
s->ssl_ctx.reused_sess[tid].allocated_size >= len) {
+   ptr = s->ssl_ctx.reused_sess[tid].ptr;
+   } else {
+   free(s->ssl_ctx.reused_sess[tid].ptr);
+   ptr = s->ssl_ctx.reused_sess[tid].ptr = malloc(len);
+   s->ssl_ctx.reused_sess[tid].allocated_size = len;
+   }
+   if (s->ssl_ctx.reused_sess[tid].ptr) {
+   s->ssl_ctx.reused_sess[tid].size = i2d_SSL_SESSION(sess,
+   &ptr);
+   }
+   } else {
+   free(s->ssl_ctx.reused_sess[tid].ptr);
+   s->ssl_ctx.reused_sess[tid].ptr = NULL;
+   }
+
+   return 0;
 }
 
+
 /* SSL callback used on new session creation */
 int sh_ssl_sess_new_cb(SSL *ssl, SSL_SESSION *sess)
 {
@@ -4434,7 +4450,7 @@ int ssl_sock_prepare_srv_ctx(struct server *srv)
 
/* Initiate SSL context for current server */
if (!srv->ssl_ctx.reused_sess) {
-   if ((srv->ssl_ctx.reused_sess = calloc(1, 
global.nbthread*sizeof(SSL_SESSION*))) == NULL) {
+   if ((srv->ssl_ctx.reused_sess = calloc(1, 
global.nbthread*sizeof(*srv->ssl_ctx.reused_sess))) == NULL) {
Alert("Proxy '%s', server '%s' [%s:%d] out of 
memory.\n",
  curproxy->id, srv->id,
  srv->conf.file, srv->conf.line);
@@ -4923,10 +4939,15 @@ static int ssl_sock_init(struct connection *conn)
}
 
SSL_set_connect_state(conn->xprt_ctx);
-   if 

[PATCH] do the handshake if we can't send early data

2017-11-22 Thread Olivier Houchard
Hi,

We mistakenly only try to go back to the SSL handshake when unable to send
early data if we're acting as a client. That is wrong, and leads to an
infinite loop if it happens on the server side.
The attached patch should fix this.

Regards,

Olivier
>From 2c011f4bfa515495c47c2495510ee01b199d4a26 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 22 Nov 2017 17:38:37 +0100
Subject: [PATCH] BUG/MINOR: ssl: Always start the handshake if we can't send
 early data.

The current code only tries to do the handshake, when we can't send early
data, if we're acting as a client. This is wrong: it has to be done on the
server side too, or we end up in an infinite loop.
---
 src/ssl_sock.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index d1977960c..b8793fce6 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -5514,10 +5514,8 @@ static int ssl_sock_from_buf(struct connection *conn, 
struct buffer *buf, int fl
if (try + conn->tmp_early_data > max_early) {
try -= (try + conn->tmp_early_data) - max_early;
if (try <= 0) {
-   if (objt_server(conn->target)) {
-   conn->flags &= 
~CO_FL_EARLY_SSL_HS;
-   conn->flags |= 
CO_FL_SSL_WAIT_HS | CO_FL_WAIT_L6_CONN;
-   }
+   conn->flags &= ~CO_FL_EARLY_SSL_HS;
+   conn->flags |= CO_FL_SSL_WAIT_HS | 
CO_FL_WAIT_L6_CONN;
break;
}
}
-- 
2.14.3



Re: [PATCH] do the handshake if we can't send early data

2017-11-22 Thread Olivier Houchard
On Wed, Nov 22, 2017 at 05:42:42PM +0100, Olivier Houchard wrote:
> Hi,
> 
> We mistakenly only try to go back to the SSL handshake when unable to send
> early data if we're acting as a client. That is wrong, and leads to an
> infinite loop if it happens on the server side.
> The attached patch should fix this.
> 

And a second patch on top of this one. We should not stop trying to read
early data until SSL_read_early_data() returns SSL_READ_EARLY_DATA_FINISH.
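
A minimal blocking-style sketch of the resulting read loop (plain OpenSSL
1.1.1 API; the haproxy code is event-driven, so this is illustrative only):

#include <openssl/ssl.h>

/* Keep draining early data until OpenSSL reports FINISH; stopping earlier
 * leaves the library mid-early-data and the handshake can then fail. */
static int drain_early_data(SSL *ssl, void (*consume)(const void *, size_t))
{
    unsigned char buf[4096];
    size_t readbytes;

    for (;;) {
        switch (SSL_read_early_data(ssl, buf, sizeof(buf), &readbytes)) {
        case SSL_READ_EARLY_DATA_SUCCESS:
            consume(buf, readbytes);
            break;                /* more may follow: loop again */
        case SSL_READ_EARLY_DATA_FINISH:
            return 1;             /* now complete the handshake */
        case SSL_READ_EARLY_DATA_ERROR:
        default:
            return 0;             /* check SSL_get_error() for WANT_READ etc. */
        }
    }
}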

Regards,

Olivier
>From 3d4df5b19799c513a74caa744ac3d0df81467608 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 22 Nov 2017 19:12:10 +0100
Subject: [PATCH 2/2] MINOR: ssl: Don't disable early data handling if we could
 not write.

If we can't write early data for some reason, don't give up on reading them;
there may still be early data to be read, and if we don't read them, OpenSSL's
internal state might become inconsistent and the handshake will fail.
---
 src/ssl_sock.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index b8793fce6..24bb36877 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -5514,7 +5514,6 @@ static int ssl_sock_from_buf(struct connection *conn, 
struct buffer *buf, int fl
if (try + conn->tmp_early_data > max_early) {
try -= (try + conn->tmp_early_data) - max_early;
if (try <= 0) {
-   conn->flags &= ~CO_FL_EARLY_SSL_HS;
conn->flags |= CO_FL_SSL_WAIT_HS | 
CO_FL_WAIT_L6_CONN;
break;
}
-- 
2.14.3



[PATCH] ssl/mux: Handle early data with multiple streams

2017-11-23 Thread Olivier Houchard
Hi,

The attached patches attempt to make early data behave well when there are
multiple streams in one connection, i.e. with HTTP/2.
The first one makes sure we can mix reads and writes for early data. With a
single stream, we usually just read and then write; with multiple streams,
we can read, handle the request, write, and then read more data for a
different stream.
The second patch makes wait_for_handshake work for HTTP/2, by making sure
all the stream tasks are woken up once the handshake is done, in case some
of the streams were waiting for it.

Regards,

Olivier


>From cdb181d78466a1ce2be2b8b621231ba2086f4979 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Thu, 23 Nov 2017 18:21:29 +0100
Subject: [PATCH 1/2] MINOR: ssl: Handle reading early data after writing
 better.

It can happen that we want to read early data, write some, and then continue
reading them.
To do so, we can't reuse tmp_early_data to store the amount of data sent,
so introduce a new member.
If we read early data, ssl_sock_to_buf() is now solely responsible for
getting back to the handshake, to make sure we don't miss any early data.
---
 include/proto/connection.h |  1 +
 include/types/connection.h |  1 +
 src/ssl_sock.c | 23 ++-
 3 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/include/proto/connection.h b/include/proto/connection.h
index 05a63fe6a..c68ae20ba 100644
--- a/include/proto/connection.h
+++ b/include/proto/connection.h
@@ -607,6 +607,7 @@ static inline void conn_init(struct connection *conn)
conn->obj_type = OBJ_TYPE_CONN;
conn->flags = CO_FL_NONE;
conn->tmp_early_data = -1;
+   conn->sent_early_data = 0;
conn->mux = NULL;
conn->mux_ctx = NULL;
conn->owner = NULL;
diff --git a/include/types/connection.h b/include/types/connection.h
index 88573b8a7..a9d04474d 100644
--- a/include/types/connection.h
+++ b/include/types/connection.h
@@ -373,6 +373,7 @@ struct connection {
void *owner;  /* pointer to the owner session for 
incoming connections, or NULL */
int xprt_st;  /* transport layer state, initialized to 
zero */
int tmp_early_data;   /* 1st byte of early data, if any */
+   int sent_early_data;  /* Amount of early data we sent so far */
union conn_handle handle; /* connection handle at the socket layer 
*/
enum obj_type *target;/* the target to connect to (server, 
proxy, applet, ...) */
struct list list; /* attach point to various connection 
lists (idle, ...) */
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 24bb36877..28f770664 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -1377,7 +1377,7 @@ void ssl_sock_infocbk(const SSL *ssl, int where, int ret)
 
if (where & SSL_CB_HANDSHAKE_START) {
/* Disable renegotiation (CVE-2009-3555) */
-   if ((conn->flags & (CO_FL_CONNECTED | CO_FL_EARLY_SSL_HS)) == 
CO_FL_CONNECTED) {
+   if ((conn->flags & (CO_FL_CONNECTED | CO_FL_EARLY_SSL_HS | 
CO_FL_EARLY_DATA)) == CO_FL_CONNECTED) {
conn->flags |= CO_FL_ERROR;
conn->err_code = CO_ER_SSL_RENEG;
}
@@ -5318,15 +5318,6 @@ static int ssl_sock_to_buf(struct connection *conn, 
struct buffer *buf, int coun
/* let's realign the buffer to optimize I/O */
if (buffer_empty(buf)) {
buf->p = buf->data;
-#if (OPENSSL_VERSION_NUMBER >= 0x10101000L)
-   /*
-* If we're done reading the early data, and we're using
-* a new buffer, then we know for sure we're not tainted
-* with early data anymore
-*/
-   if ((conn->flags & (CO_FL_EARLY_SSL_HS |CO_FL_EARLY_DATA)) == 
CO_FL_EARLY_DATA)
-   conn->flags &= ~CO_FL_EARLY_DATA;
-#endif
}
 
/* read the largest possible block. For this, we perform only one call
@@ -5499,9 +5490,6 @@ static int ssl_sock_from_buf(struct connection *conn, 
struct buffer *buf, int fl
if (!SSL_is_init_finished(conn->xprt_ctx)) {
unsigned int max_early;
 
-   if (conn->tmp_early_data == -1)
-   conn->tmp_early_data = 0;
-
if (objt_listener(conn->target))
max_early = 
SSL_get_max_early_data(conn->xprt_ctx);
else {
@@ -5511,17 +5499,18 @@ static int ssl_sock_from_buf(struct connection *conn, 
struct buffer *buf, int fl
max_early = 0;
}
 
-   if (try + conn->tmp_early_data > max_early) {
-   try -= (try + conn->tmp_early_data) - 

Re: [PATCH] MINOR: ssl: Handle early data with BoringSSL

2017-11-24 Thread Olivier Houchard
Hi Willy,

On Thu, Nov 23, 2017 at 07:44:13PM +0100, Willy Tarreau wrote:
> On Thu, Nov 23, 2017 at 04:16:39PM +0100, Emmanuel Hocdet wrote:
> > 
> > simplify patch:
> > no need to bypass post SSL_do_handshake process, only remove 
> > CO_FL_EARLY_SSL_HS
> > when handshake can't support early data.
> 
> Thanks Manu. Olivier just fixed some parts in the exact area by removing
> some code, so I'd feel more confident with him double-checking if this
> can be applied as-is or if it risks to reintroduce part of the issues he
> just fixed. So Olivier, please have a check and let me know.
> 

I think this is fine.

Regards,

Olivier



[PATCH] Rename the global variable "proxy" to "proxies_list" replace-header

2017-11-24 Thread Olivier Houchard
Hi,

The attached patch renames the global variable "proxy" to "proxies_list".
"proxy" is commonly used as a name for function arguments, and a number of
times, "proxy" was used when another variable was really meant. It worked
by luck, but can certainly come back to bite us at some point.
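
A hypothetical illustration of the hazard (not actual haproxy code):

struct proxy { struct proxy *next; };

struct proxy *proxy;              /* the old global: head of the list */

static void helper(void)
{
    /* always the GLOBAL list head, whatever the caller intended */
    struct proxy *p = proxy;
    (void)p;
}

static void check_one(struct proxy *proxy)   /* shadows the global */
{
    helper();   /* still sees the global; only correct "by luck" while
                 * the global happens to point at the same proxy */
}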

Regards,

Olivier
>From da26886c44f7bd9dff656c43498664fb3518775d Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Fri, 24 Nov 2017 16:54:05 +0100
Subject: [PATCH] MINOR/CLEANUP: proxy: rename "proxy" to "proxies_list"

Rename the global variable "proxy" to "proxies_list".
There's been multiple proxies in haproxy for quite some time, and "proxy"
is a potential source of bugs, a number of functions have a "proxy" argument,
and some code used "proxy" when it really meant "px" or "curproxy". It worked
by pure luck, because it usually happened while parsing the config, and thus
"proxy" pointed to the currently parsed proxy, but we should probably not
rely on this.
---
 include/proto/proxy.h |  2 +-
 src/cache.c   |  2 +-
 src/cfgparse.c| 34 +-
 src/checks.c  |  4 ++--
 src/cli.c |  8 
 src/dns.c |  2 +-
 src/filters.c |  6 +++---
 src/flt_spoe.c|  2 +-
 src/haproxy.c | 14 +++---
 src/hlua_fcn.c|  6 +++---
 src/proto_http.c  | 16 
 src/proto_tcp.c   |  4 ++--
 src/proxy.c   | 17 -
 src/server.c  |  5 ++---
 src/stats.c   |  4 ++--
 src/stick_table.c |  2 +-
 16 files changed, 63 insertions(+), 65 deletions(-)

diff --git a/include/proto/proxy.h b/include/proto/proxy.h
index cb86159a9..d4a34a54f 100644
--- a/include/proto/proxy.h
+++ b/include/proto/proxy.h
@@ -31,7 +31,7 @@
 #include 
 #include 
 
-extern struct proxy *proxy;
+extern struct proxy *proxies_list;
 extern struct eb_root used_proxy_id;   /* list of proxy IDs in use */
 extern unsigned int error_snapshot_id;  /* global ID assigned to each error 
then incremented */
 extern struct eb_root proxy_by_name;/* tree of proxies sorted by name */
diff --git a/src/cache.c b/src/cache.c
index 7ae3267e5..f36ada238 100644
--- a/src/cache.c
+++ b/src/cache.c
@@ -840,7 +840,7 @@ int cfg_cache_postparser()
int err = 0;
struct flt_conf *fconf;
 
-   for (curproxy = proxy; curproxy; curproxy = curproxy->next) {
+   for (curproxy = proxies_list; curproxy; curproxy = curproxy->next) {
 
/* resolve the http response cache name to a ptr in the action 
rule */
list_for_each_entry(hresrule, &curproxy->http_res_rules, list) {
diff --git a/src/cfgparse.c b/src/cfgparse.c
index 1a75686cb..9bebd644d 100644
--- a/src/cfgparse.c
+++ b/src/cfgparse.c
@@ -2747,8 +2747,8 @@ int cfg_parse_listen(const char *file, int linenum, char 
**args, int kwm)
}
 
init_new_proxy(curproxy);
-   curproxy->next = proxy;
-   proxy = curproxy;
+   curproxy->next = proxies_list;
+   proxies_list = curproxy;
curproxy->conf.args.file = curproxy->conf.file = strdup(file);
curproxy->conf.args.line = curproxy->conf.line = linenum;
curproxy->last_change = now.tv_sec;
@@ -7546,18 +7546,18 @@ int check_config_validity()
 
/* first, we will invert the proxy list order */
curproxy = NULL;
-   while (proxy) {
+   while (proxies_list) {
struct proxy *next;
 
-   next = proxy->next;
-   proxy->next = curproxy;
-   curproxy = proxy;
+   next = proxies_list->next;
+   proxies_list->next = curproxy;
+   curproxy = proxies_list;
if (!next)
break;
-   proxy = next;
+   proxies_list = next;
}
 
-   for (curproxy = proxy; curproxy; curproxy = curproxy->next) {
+   for (curproxy = proxies_list; curproxy; curproxy = curproxy->next) {
struct switching_rule *rule;
struct server_rule *srule;
struct sticking_rule *mrule;
@@ -8819,7 +8819,7 @@ out_uri_auth_compat:
}
 
/* Make each frontend inherit bind-process from its listeners when not 
specified. */
-   for (curproxy = proxy; curproxy; curproxy = curproxy->next) {
+   for (curproxy = proxies_list; curproxy; curproxy = curproxy->next) {
if (curproxy->bind_proc)
continue;
 
@@ -8849,14 +8849,14 @@ out_uri_auth_compat:
 * are any fatal errors as we must not call it with unresolved proxies.
 */
if (!cfgerr) {
-   for (curproxy = proxy; curproxy; curproxy = curproxy-&

Re: [PATCH] BUG/MINOR: ssl: fix CO_FL_EARLY_DATA removal with http mode

2017-11-27 Thread Olivier Houchard
Hi Emmanuel,

On Mon, Nov 27, 2017 at 05:17:54PM +0100, Emmanuel Hocdet wrote:
> 
> Hi,
> 
> This patch fix CO_FL_EARLY_DATA removal to have correct ssl_fc_has_early
> reporting. It work for 'mode http'.
> 
> It does not fix ssl_fc_has_early for 'mode tcp'. In this mode CO_FL_EARLY_DATA
> should not be removed if early data was accepted.
> It is possible to check MODE_TCP in mux_pt_recv? Or there is another way of 
> address this?
> 

The first patch seems fine.
The second breaks wait-for-handshake for me, I guess because recv() is called
before wake(), and so the flag is removed before that code in 
stream_interface.c:
/* If we had early data, and the handshake ended, then
 * we can remove the flag, and attempt to wake the task up,
 * in the event there's an analyser waiting for the end of
 * the handshake.
 */
if ((conn->flags & (CO_FL_EARLY_DATA | CO_FL_EARLY_SSL_HS)) == 
CO_FL_EARLY_DATA) {
task_wakeup(si_task(si), TASK_WOKEN_MSG);
}
So the stream task is never woken up.
Maybe the best is to add a new flag per conn_stream, CS_FL_WAITING_FOR_HS or
something, instead of relying on CO_FL_EARLY_DATA.
I think I'm going to do something like that.
That still doesn't help with your problem with TCP mode, though. I still want
the CO_FL_EARLY_DATA to be removed after the handshake, so that we don't
add the "Early-Data: 1" header if it is not needed. But it just occured
to me that I can easily fix that by adding the header, not only if 
CO_FL_EARLY_DATA is set, but if CO_FL_EARLY_SSL_HS or CO_FL_SSL_WAIT_HS is set.
That way we will both be happy :)
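
In code terms the idea is roughly this (illustrative pseudo-code;
add_request_header() is a hypothetical helper, not the real haproxy API):

/* Add "Early-Data: 1" whenever the request may have arrived, even
 * partially, before the handshake was confirmed. */
if (conn->flags & (CO_FL_EARLY_DATA | CO_FL_EARLY_SSL_HS | CO_FL_SSL_WAIT_HS))
    add_request_header(txn, "Early-Data", "1");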

Regards,

Olivier




Re: [PATCH] BUG/MINOR: ssl: fix CO_FL_EARLY_DATA removal with http mode

2017-11-29 Thread Olivier Houchard
On Mon, Nov 27, 2017 at 06:19:41PM +0100, Emmanuel Hocdet wrote:
> > Maybe the best is to add a new flag per conn_stream, CS_FL_WAITING_FOR_HS or
> > something, instead of relying on CO_FL_EARLY_DATA.
> > I think I'm going to do something like that.
> 
> I think it's a good idea, two different things to deal with one tag.
> 
> > That still doesn't help with your problem with TCP mode, though. I still 
> > want
> > the CO_FL_EARLY_DATA to be removed after the handshake, so that we don't
> > add the "Early-Data: 1" header if it is not needed. But it just occured
> > to me that I can easily fix that by adding the header, not only if 
> > CO_FL_EARLY_DATA is set, but if CO_FL_EARLY_SSL_HS or CO_FL_SSL_WAIT_HS is 
> > set.
> > That way we will both be happy :)
> 

So, this is my first cut at it. It basically does as I said.
The only thing you may be unhappy with: I made the sample fetch ssl_fc_has_early
return 0 if we had early data but the handshake already happened, because the
main use case is to know if there are early data and if they're a potential
security risk.
If you can give me a case where we need a sample fetch to know there
were early data, even after the handshake, maybe we can introduce a new
sample fetch, ssl_fc_has_insecure_early, or something ?

Regards,

Olivier
>From bda3b7800677184ea19fb81f75f9a9b44c79efeb Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Mon, 27 Nov 2017 18:41:32 +0100
Subject: [PATCH 1/2] MINOR: early data: Don't rely on CO_FL_EARLY_DATA to wake
 up streams.

Instead of looking for CO_FL_EARLY_DATA to know if we have to try to wake
up a stream, because it is waiting for a SSL handshake, instead add a new
conn_stream flag, CS_FL_WAIT_FOR_HS. This way we don't have to rely on
CO_FL_EARLY_DATA, and we will only wake streams that are actually waiting.
---
 include/types/connection.h |  1 +
 src/mux_h2.c   | 24 ++--
 src/ssl_sock.c |  5 -
 src/stream_interface.c |  4 +++-
 4 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/include/types/connection.h b/include/types/connection.h
index a9d04474d..f220c42a9 100644
--- a/include/types/connection.h
+++ b/include/types/connection.h
@@ -69,6 +69,7 @@ enum {
 
CS_FL_ERROR = 0x0100,  /* a fatal error was reported */
CS_FL_EOS   = 0x1000,  /* End of stream */
+   CS_FL_WAIT_FOR_HS   = 0x0001,  /* This stream is waiting for 
handshake */
 };
 
 /* cs_shutr() modes */
diff --git a/src/mux_h2.c b/src/mux_h2.c
index 4567b8ff0..4db90d4c8 100644
--- a/src/mux_h2.c
+++ b/src/mux_h2.c
@@ -54,6 +54,7 @@ static struct pool_head *pool_head_h2s;
 /* other flags */
 #define H2_CF_GOAWAY_SENT   0x0100  // a GOAWAY frame was successfully 
sent
 #define H2_CF_GOAWAY_FAILED 0x0200  // a GOAWAY frame failed to be sent
+#define H2_CF_WAIT_FOR_HS   0x0400  // We did check that at least a 
stream was waiting for handshake
 
 
 /* H2 connection state, in h2c->st0 */
@@ -2091,14 +2092,25 @@ static int h2_wake(struct connection *conn)
struct h2c *h2c = conn->mux_ctx;
 
/*
-* If we received early data, try to wake any stream, just in case
-* at least one of them was waiting for the handshake
+* If we received early data, and the handshake is done, wake
+* any stream that was waiting for it.
 */
-   if ((conn->flags & (CO_FL_EARLY_SSL_HS | CO_FL_EARLY_DATA | 
CO_FL_HANDSHAKE)) ==
-   CO_FL_EARLY_DATA) {
-   h2_wake_some_streams(h2c, 0, 0);
-   conn->flags &= ~CO_FL_EARLY_DATA;
+   if (!(h2c->flags & H2_CF_WAIT_FOR_HS) &&
+   (conn->flags & (CO_FL_EARLY_SSL_HS | CO_FL_HANDSHAKE | 
CO_FL_EARLY_DATA)) == CO_FL_EARLY_DATA) {
+   struct eb32_node *node;
+   struct h2s *h2s;
+
+   h2c->flags |= H2_CF_WAIT_FOR_HS;
+   node = eb32_lookup_ge(&h2c->streams_by_id, 1);
+
+   while (node) {
+   h2s = container_of(node, struct h2s, by_id);
+   if (h2s->cs->flags & CS_FL_WAIT_FOR_HS)
+   h2s->cs->data_cb->wake(h2s->cs);
+   node = eb32_next(node);
+   }
}
+
if (conn->flags & CO_FL_ERROR || conn_xprt_read0_pending(conn) ||
h2c->st0 == H2_CS_ERROR2 || h2c->flags & H2_CF_GOAWAY_FAILED ||
(eb_is_empty(&h2c->streams_by_id) && h2c->last_sid >= 0 &&
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index da1aecbcc..7c5fd6b10 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -8705,11 +8705,14 @@ enum act_return ssl_action_wait_for_hs(struct act_rule 
*rule, struct proxy *px,

Re: [PATCH] BUG/MINOR: ssl: fix CO_FL_EARLY_DATA removal with http mode

2017-11-30 Thread Olivier Houchard
Hi Emmanuel,

On Thu, Nov 30, 2017 at 12:15:37PM +0100, Emmanuel Hocdet wrote:
> Hi Olivier,
> 
> > On 29 Nov 2017 at 19:57, Olivier Houchard wrote:
> > 
> > On Mon, Nov 27, 2017 at 06:19:41PM +0100, Emmanuel Hocdet wrote:
> >>> Maybe the best is to add a new flag per conn_stream, CS_FL_WAITING_FOR_HS 
> >>> or
> >>> something, instead of relying on CO_FL_EARLY_DATA.
> >>> I think I'm going to do something like that.
> >> 
> >> I think it's a good idea, two different things to deal with one tag.
> >> 
> >>> That still doesn't help with your problem with TCP mode, though. I still 
> >>> want
> >>> the CO_FL_EARLY_DATA to be removed after the handshake, so that we don't
> >>> add the "Early-Data: 1" header if it is not needed. But it just occured
> >>> to me that I can easily fix that by adding the header, not only if 
> >>> CO_FL_EARLY_DATA is set, but if CO_FL_EARLY_SSL_HS or CO_FL_SSL_WAIT_HS 
> >>> is set.
> >>> That way we will both be happy :)
> >> 
> > 
> > So, this is my first cut at it. It basically does as I said.
> > Only thing you may be unhappy with, I made the sample fetch ssl_fc_has_early
> > return 0 if we had early data but the handshake happened, because the main
> > use case is to know if there are early data and if they're a potential
> > security risk.
> > If you can give me a case where we have a need a sample fetch to know there
> > were early data, even after the handshake, maybe we can introduce a new
> > sample fetch, ssl_fc_has_insecure_early, or something ?
> > 
> 
> In this case, i don’t understand the interest of ssl_fc_has_early.
> 
> looking at the documentation
> ssl_fc_has_early : boolean
>   Returns true if early data were sent, and the handshake didn't happen yet. 
> As
>   it has security implications, it is useful to be able to refuse those, or
>   wait until the handshake happened.
> 
> ssl_fc_* can be used after the front connection at ssl level: handshake will 
> be finished at this time.
> ssl_fc_has_early should be: returns true if early data were sent and accepted 
> in ssl level. (425 return is http level)
> 
> What the description makes me think, and understand what you said, is that it 
> could be a « ssl_hs_has_early ».
> I’m very interesting to have acl on hs negotiation, i don't know how to do 
> that now in haproxy.
> 

No, ssl_fc_has_early can be used before the handshake has happened. If we
received early data, the request will be processed before the handshake (and
potentially the answer from the server will be sent before the handshake is
done). So ssl_fc_has_early is there to be able to stop any "dangerous" request.
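
As a usage sketch (hypothetical configuration combining the allow-0rtt bind
keyword, the wait-for-handshake action and this fetch; METH_POST is one of
haproxy's predefined ACLs):

frontend fe_https
    bind :443 ssl crt /etc/haproxy/site.pem allow-0rtt
    # Early data can be replayed: hold any non-idempotent request until
    # the handshake completes, at which point ssl_fc_has_early turns false.
    http-request wait-for-handshake if METH_POST { ssl_fc_has_early }
    default_backend be_app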

Regards,

Olivier



Re: [PATCH] BUG/MINOR: ssl: fix CO_FL_EARLY_DATA removal with http mode

2017-11-30 Thread Olivier Houchard
On Thu, Nov 30, 2017 at 03:32:20PM +0100, Emmanuel Hocdet wrote:
> 
> > On 30 Nov 2017 at 13:34, Olivier Houchard wrote:
> > 
> > Hi Emmanuel,
> > 
> > On Thu, Nov 30, 2017 at 12:15:37PM +0100, Emmanuel Hocdet wrote:
> >> Hi Olivier,
> >> 
> >>> On 29 Nov 2017 at 19:57, Olivier Houchard wrote:
> >>> 
> >>> On Mon, Nov 27, 2017 at 06:19:41PM +0100, Emmanuel Hocdet wrote:
> >>>>> Maybe the best is to add a new flag per conn_stream, 
> >>>>> CS_FL_WAITING_FOR_HS or
> >>>>> something, instead of relying on CO_FL_EARLY_DATA.
> >>>>> I think I'm going to do something like that.
> >>>> 
> >>>> I think it's a good idea, two different things to deal with one tag.
> >>>> 
> >>>>> That still doesn't help with your problem with TCP mode, though. I 
> >>>>> still want
> >>>>> the CO_FL_EARLY_DATA to be removed after the handshake, so that we don't
> >>>>> add the "Early-Data: 1" header if it is not needed. But it just occured
> >>>>> to me that I can easily fix that by adding the header, not only if 
> >>>>> CO_FL_EARLY_DATA is set, but if CO_FL_EARLY_SSL_HS or CO_FL_SSL_WAIT_HS 
> >>>>> is set.
> >>>>> That way we will both be happy :)
> >>>> 
> >>> 
> >>> So, this is my first cut at it. It basically does as I said.
> >>> Only thing you may be unhappy with, I made the sample fetch 
> >>> ssl_fc_has_early
> >>> return 0 if we had early data but the handshake happened, because the main
> >>> use case is to know if there are early data and if they're a potential
> >>> security risk.
> >>> If you can give me a case where we need a sample fetch to know 
> >>> there
> >>> were early data, even after the handshake, maybe we can introduce a new
> >>> sample fetch, ssl_fc_has_insecure_early, or something ?
> >>> 
> >> 
> >> In this case, i don’t understand the interest of ssl_fc_has_early.
> >> 
> >> looking at the documentation
> >> ssl_fc_has_early : boolean
> >>  Returns true if early data were sent, and the handshake didn't happen 
> >> yet. As
> >>  it has security implications, it is useful to be able to refuse those, or
> >>  wait until the handshake happened.
> >> 
> >> ssl_fc_* can be used after the front connection at ssl level: handshake 
> >> will be finished at this time.
> >> ssl_fc_has_early should be: returns true if early data were sent and 
> >> accepted in ssl level. (425 return is http level)
> >> 
> >> What the description makes me think, and understand what you said, is that 
> >> it could be a « ssl_hs_has_early ».
> >> I’m very interesting to have acl on hs negotiation, i don't know how to do 
> >> that now in haproxy.
> >> 
> > 
> > No, sls_fc_has_early can be used before the handshake happened. If we
> > received early data, the request will be treated before the handshake (and
> > potentially the answer from the server will be sent before the handshake is
> > done). So ssl_fc_has_early is there to be able to stop any "dangerous" 
> > request.
> > 
> 
> My fault, i don’t read the doc of ssl_fc_has_early: as user it confused me on 
> the meaning
> and usage.
> What i mean is about the name of the prefix  ssl_fc*  ssl front connection.
> All ssl_fc_* can be used in log-format because is valid after the end of HS.
> ssl_fc_has_early is the first _fc_ with a different behavior and is not PoLA :
> ssl_fc_has_early can be used in http mode but it can false report in 
> log-format, and is useless in tcp mode? but it’s ssl_fc_* ...
> 
> Technically, with early HS, we have new informations who can changed after 
> the HS.
> I think such informations deserves a new prefix for sample.  ( ssl_fhs_* ?)
> For exemple, it could be used for acl in early HS (of course in http mode and 
> perhaps one day in tcp mode).
> 

Oh ok, you're right, I picked fc without much thought; I wasn't sure what
to use, and didn't bother to ask. Maybe indeed a new prefix would be wise.

Regards,

Olivier



[PATCH] Make thread affinity work on FreeBSD

2017-12-01 Thread Olivier Houchard

Hi,

The attached patch makes the call to pthread_setaffinity_np() work on
FreeBSD.

Regards,

Olivier
>From fc204ac3d7f9323b6583465ff5b42a0cfa46b8b1 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Fri, 1 Dec 2017 18:19:43 +0100
Subject: [PATCH] MINOR: threads: Fix pthread_setaffinity_np on FreeBSD.

As with the call to cpuset_setaffinity(), FreeBSD expects the argument to
pthread_setaffinity_np() to be a cpuset_t, not an unsigned long, so the call
was silently failing.

This should probably be backported to 1.8.
---
 src/haproxy.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/src/haproxy.c b/src/haproxy.c
index a1fe550e1..4e3c82b86 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -2900,10 +2900,24 @@ int main(int argc, char **argv)
global.cpu_map.thread[relative_pid-1][i] &= 
global.cpu_map.proc[relative_pid-1];
 
if (i < LONGBITS &&   /* only the first 32/64 
threads may be pinned */
-   global.cpu_map.thread[relative_pid-1][i]) /* only 
do this if the thread has a THREAD map */
+   global.cpu_map.thread[relative_pid-1][i]) {/* only 
do this if the thread has a THREAD map */
+#if defined(__FreeBSD__) || defined(__NetBSD__)
+   cpuset_t cpuset;
+#else
+   cpu_set_t cpuset;
+#endif
+   int j;
+   unsigned long cpu_map = 
global.cpu_map.thread[relative_pid-1][i];
+
+   CPU_ZERO(&cpuset);
+
+   while ((j = ffsl(cpu_map)) > 0) {
+   CPU_SET(j - 1, &cpuset);
+   cpu_map &= ~(1 << (j - 1));
+   }
pthread_setaffinity_np(threads[i],
-  sizeof(unsigned long),
-  (void 
*)&global.cpu_map.thread[relative_pid-1][i]);
+  sizeof(cpuset), &cpuset);
+   }
}
 #endif /* !USE_CPU_AFFINITY */
 
-- 
2.14.3



Re: Segfault with 1.8.0 build (RHEL5, old gcc).

2017-12-01 Thread Olivier Houchard
Hi Christopher,

On Fri, Dec 01, 2017 at 11:59:14AM -0800, Christopher Lane wrote:
> gist with backtrace, -vv output, and config file.  Also strace.
> 
> https://gist.github.com/jayalane/c6dbe7918aa9635b62c874d20f57dfec
> 
> It does all the listens and then right after the first epoll is done it has
> this segv.  all the local variables are "optimized out"
> 
> (We really want the hard-stop-after -- we get a leak of children with our
> super frequent soft-reloads).
> 
> It's possible something is messed up in my compiling/linking environment,
> but I don't see anything, and I thought maybe a .0 release had a chance of
> issues on very old RHEL versions.  We currently run 1.5 and have to
> manually clean up the left over children (but are hoping to scale up to
> nbproc 4 and use CPU pinning).
> 
> I'm debugging also, but hoped someone would have something off the top of
> their head (apologies for not having read the list for a while before
> posting).
> 

Thanks a lot for the very detailed report !
I think I know what's going on.
In your config file, you have :
 server stage.qa.example.com:11302 stage.qa.example.com check inter 2m 
fastinter 2s fall 2
It should probably be :
 server stage.qa.example.com stage.qa.example.com:11302 check inter 2m 
fastinter 2s fall 2

However this is of course no reason to segfault.
Can you please try the attached patch, and let me know if it fixes your issue ?

Thanks a lot !

Olivier
>From 5236a1a4ac19cc27c6f06d328b2df0c4cdfe220c Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Fri, 1 Dec 2017 22:04:05 +0100
Subject: [PATCH] MINOR: checks: Be sure we have a mux if we created a cs.

In connect_conn_chk(), there was one case where we could return with a new
conn_stream created but no mux attached. With no mux, cs_destroy() would
segfault. Fix that by setting the mux before we can fail.

This should be backported to 1.8.
---
 src/checks.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/src/checks.c b/src/checks.c
index 63747201e..eaf84a225 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -1564,25 +1564,23 @@ static int connect_conn_chk(struct task *t)
conn->addr.to = s->addr;
}
 
+   proto = protocol_by_family(conn->addr.to.ss_family);
+
+   conn_prepare(conn, proto, check->xprt);
+   conn_install_mux(conn, &mux_pt_ops, cs);
+   cs_attach(cs, check, &check_conn_cb);
+   conn->target = &s->obj_type;
+
if ((conn->addr.to.ss_family == AF_INET) || (conn->addr.to.ss_family == 
AF_INET6)) {
int i = 0;
 
i = srv_check_healthcheck_port(check);
-   if (i == 0) {
-   cs->data = check;
+   if (i == 0)
return SF_ERR_CHK_PORT;
-   }
 
set_host_port(&conn->addr.to, i);
}
 
-   proto = protocol_by_family(conn->addr.to.ss_family);
-
-   conn_prepare(conn, proto, check->xprt);
-   conn_install_mux(conn, &mux_pt_ops, cs);
-   cs_attach(cs, check, &check_conn_cb);
-   conn->target = &s->obj_type;
-
/* no client address */
clear_addr(&conn->addr.from);
 
-- 
2.14.3



Re: 1.9 BUG: redispatch broken

2018-12-24 Thread Olivier Houchard
Hi Lukas,

On Sat, Dec 22, 2018 at 11:16:09PM +0100, Lukas Tribus wrote:
> Hello Oliver,
> 
> 
> redispatch is broken since commit 25b401536 ("BUG/MEDIUM: connection:
> Just make sure we closed the fd on connection failure"). It simply
> fails to connect to the next server.
> 
> 1.9 is affected.
> 
> 
> Repro:
> 
> global
> log 10.0.0.4:514 len 65535 local1 debug
> maxconn 1000
> 
> defaults
> log global
> mode http
> option httplog
> timeout connect 1s
> timeout client 30s
> timeout server 30s
> retries 3
> option redispatch 1
> 
> frontend http-in
> bind :80
> default_backend mybak
> 
> backend mybak
> mode http
> balance first
> server primary-fail 10.0.0.199:80 # this server is unreachable
> server backup-ok 10.0.0.254:80
> 
> 
> 
> 
> cheers,

Ooops you're right indeed. The attached patch should fix it.

Thanks a lot for reporting !

Regards,

Olivier
>From 2276c53dac820d0079525730e9bd7abfd3ea408c Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Mon, 24 Dec 2018 13:32:13 +0100
Subject: [PATCH] BUG/MEDIUM: servers: Don't try to reuse connection if we
 switched server.

In connect_server(), don't attempt to reuse the old connection if it's
targeting a different server than the one we're supposed to access, or
we will never be able to connect to a server if the first one we tried failed.

This should be backported to 1.9.
---
 src/backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend.c b/src/backend.c
index 2407f8a32..bc38c5710 100644
--- a/src/backend.c
+++ b/src/backend.c
@@ -1129,7 +1129,7 @@ int connect_server(struct stream *s)
srv_cs = objt_cs(s->si[1].end);
if (srv_cs) {
old_conn = srv_conn = cs_conn(srv_cs);
-   if (old_conn) {
+   if (old_conn && (!old_conn->target || old_conn->target == 
s->target)) {
old_conn->flags &= ~(CO_FL_ERROR | CO_FL_SOCK_RD_SH | 
CO_FL_SOCK_WR_SH);
srv_cs->flags &= ~(CS_FL_ERROR | CS_FL_EOS | 
CS_FL_REOS);
reuse = 1;
-- 
2.19.2



Re: State of 0-RTT TLS resumption with OpenSSL

2019-01-02 Thread Olivier Houchard
Hi Janusz,

On Sun, Dec 30, 2018 at 05:38:26PM +0100, Janusz Dziemidowicz wrote:
> Hi,
> I've been trying to get 0-RTT resumption working with haproxy 1.8.16
> and OpenSSL 1.1.1a.
> No matter what I put in configuration file, testing with openssl
> s_client always results in:
> Max Early Data: 0
> 
> OK, let's look at ssl_sock.c
> The only thing that seems to try to enable 0-RTT is this:
> #ifdef OPENSSL_IS_BORINGSSL
> if (allow_early)
> SSL_set_early_data_enabled(ssl, 1);
> #else
> if (!allow_early)
> SSL_set_max_early_data(ssl, 0);
> #endif
> 
> But I fail to see how this is supposed to work. OpenSSL has 0-RTT
> disabled by default. To enable this one must call
> SSL_set_max_early_data with the amount of bytes it is willing to read.
> The above simply does... nothing.
> 
> Is it supposed to work at all or do I miss something? ;)
> 

You're right indeed. 0RTT was added with a development version of OpenSSL 1.1.1,
which had a default value for max early data of 16384, but it was changed to
0 in the meantime.
Does the attached patch work for you ?

Thanks !

Olivier
>From cdb864da7cebb97800aef2e114bae6f0d0f96814 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 2 Jan 2019 18:46:41 +0100
Subject: [PATCH] MEDIUM: ssl: Call SSL_CTX_set_max_early_data() to enable
 0RTT.

When we want to enable early data on a listener, explicitely call
SSL_CTX_set_max_early_data(), as the default is now 0.

This should be backported to 1.8.
---
 src/ssl_sock.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 282b85dd..c24de955 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -3869,6 +3869,8 @@ ssl_sock_initial_ctx(struct bind_conf *bind_conf)
SSL_CTX_set_select_certificate_cb(ctx, ssl_sock_switchctx_cbk);
SSL_CTX_set_tlsext_servername_callback(ctx, ssl_sock_switchctx_err_cbk);
 #elif (OPENSSL_VERSION_NUMBER >= 0x10101000L)
+   if (bind_conf->ssl_conf.early_data)
+   SSL_CTX_set_max_early_data(ctx, global.tune.bufsize - 
global.tune.maxrewrite);
SSL_CTX_set_client_hello_cb(ctx, ssl_sock_switchctx_cbk, NULL);
SSL_CTX_set_tlsext_servername_callback(ctx, ssl_sock_switchctx_err_cbk);
 #else
-- 
2.14.4



Re: State of 0-RTT TLS resumption with OpenSSL

2019-01-03 Thread Olivier Houchard
Hi Janusz,

On Thu, Jan 03, 2019 at 11:49:35AM +0100, Janusz Dziemidowicz wrote:
> Wed, 2 Jan 2019 at 19:04, Olivier Houchard wrote:
> > You're right indeed. 0RTT was added with a development version of OpenSSL 
> > 1.1.1,
> > which had a default value for max early data of 16384, but it was changed to
> > 0 in the meanwhile.
> > Does the attached patch work for you ?
> 
> This indeed results in the following when using s_client:
> Max Early Data: 16385
> 
> However, I believe it still does not work. I was trying again to test
> it with s_client.
> 
> Without allow-0rtt option I can resume TLS 1.3 session without problem:
> openssl s_client -connect host:port -sess_out sessfile
> openssl s_client -connect host:port -sess_in sessfile
> This results with:
> Reused, TLSv1.3, Cipher is TLS_CHACHA20_POLY1305_SHA256
> 
> As soon as I add allow-0rtt (and your patch), the above s_client commands
> always result in a new session:
> New, TLSv1.3, Cipher is TLS_CHACHA20_POLY1305_SHA256
> No matter what I do I was not able to resume any session with allow-0rtt 
> active.
> 
> Just to rule out that I am using s_client in a wrong way I've made the
> same test against s_server. I was able to successfully resume session
> and even send early data that was accepted. So I believe that there is
> still something wrong in haproxy with TLS session handling.
> 

Ah I think I figured it out.
OpenSSL added anti-replay protection when using early data, and it messes up
with the session handling.
With the updated attached patch, I get early data to work again. Is it better
for you ?

Regards,

Olivier
>From 82126322107bc628e32ff300195951fd660a43ac Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 2 Jan 2019 18:46:41 +0100
Subject: [PATCH] BUG/MEDIUM: ssl: Disable anti-replay protection and set max
 data with 0RTT.

When using early data, disable the OpenSSL anti-replay protection, and set
the max amount of early data we're ready to accept, based on the size of
buffers, or early data won't work with the released OpenSSL 1.1.1.

This should be backported to 1.8.
---
 src/ssl_sock.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 282b85dd..13ce2e5b 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -3869,6 +3869,10 @@ ssl_sock_initial_ctx(struct bind_conf *bind_conf)
SSL_CTX_set_select_certificate_cb(ctx, ssl_sock_switchctx_cbk);
SSL_CTX_set_tlsext_servername_callback(ctx, ssl_sock_switchctx_err_cbk);
 #elif (OPENSSL_VERSION_NUMBER >= 0x10101000L)
+   if (bind_conf->ssl_conf.early_data) {
+   SSL_CTX_set_options(ctx, SSL_OP_NO_ANTI_REPLAY);
+   SSL_CTX_set_max_early_data(ctx, global.tune.bufsize - global.tune.maxrewrite);
+   }
SSL_CTX_set_client_hello_cb(ctx, ssl_sock_switchctx_cbk, NULL);
SSL_CTX_set_tlsext_servername_callback(ctx, ssl_sock_switchctx_err_cbk);
 #else
-- 
2.14.4
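For context, a standalone sketch (my illustration, assuming OpenSSL >= 1.1.1
and blocking I/O; this is not haproxy code) of the two server-side knobs the
patch touches, plus the loop a server uses to consume the early data itself:

    #include <stdint.h>
    #include <openssl/ssl.h>

    static void enable_0rtt(SSL_CTX *ctx, uint32_t max_bytes)
    {
        /* the default is 0, i.e. 0-RTT refused, so raising this is mandatory */
        SSL_CTX_set_max_early_data(ctx, max_bytes);
        /* disable OpenSSL's built-in single-use-session anti-replay so a
         * session can be resumed with early data more than once; replay
         * safety is then the application layer's job */
        SSL_CTX_set_options(ctx, SSL_OP_NO_ANTI_REPLAY);
    }

    static int drain_early_data(SSL *ssl, char *buf, size_t len)
    {
        size_t rd;

        for (;;) {
            switch (SSL_read_early_data(ssl, buf, len, &rd)) {
            case SSL_READ_EARLY_DATA_ERROR:
                return -1; /* handshake failure */
            case SSL_READ_EARLY_DATA_SUCCESS:
                break;     /* got rd bytes of 0-RTT payload, keep reading */
            case SSL_READ_EARLY_DATA_FINISH:
                return 0;  /* end of early data, handshake completes */
            }
        }
    }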



Re: State of 0-RTT TLS resumption with OpenSSL

2019-01-04 Thread Olivier Houchard
Hi Janusz,

On Fri, Jan 04, 2019 at 10:53:51AM +0100, Janusz Dziemidowicz wrote:
> On Thu, 3 Jan 2019 at 17:52, Olivier Houchard wrote:
> > Ah I think I figured it out.
> > OpenSSL added anti-replay protection when using early data, and it messes up
> > with the session handling.
> > With the updated attached patch, I get early data to work again. Is it 
> > better
> > for you ?
> 
> Now it works.
> However, I am a bit concerned about disabling something that sounds
> like an important safeguard.
> Reading this 
> https://www.openssl.org/docs/man1.1.1/man3/SSL_SESSION_get_max_early_data.html#REPLAY-PROTECTION
> suggests that it is really not a wise thing to do.
> 
> And again, s_server works differently. It does not use
> SSL_OP_NO_ANTI_REPLAY but the resumption, with early data, works,
> once. Then you get new session that you can resume again if you wish,
> but also once. You cannot resume the same session twice. With your
> patch I can resume single session as many times as I wish. Coupled
> with early data this is exactly something that the TLS 1.3 RFC warns
> against. This probably is due to haproxy using external session
> management.
> 
> I'll try to dig more into this on weekend, now that I know where to look.
> 

I understand the concern.
I checked and both nghttp2 and nginx disable the replay protection. The idea
is that you're supposed to allow early data only on harmless requests anyway,
i.e. ones that could be replayed with no consequence.

Regards,

Olivier
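To make the "harmless requests" rule concrete, here is a small sketch (my
illustration, not haproxy code) of the application-level test an HTTP layer
can apply: OpenSSL reports whether a request arrived as accepted early data,
and anything unsafe to replay can then be refused with "425 Too Early"
(RFC 8470) until the handshake completes:

    #include <string.h>
    #include <openssl/ssl.h>

    /* returns 1 if the request may be processed now, 0 if the server
     * should answer "425 Too Early" and let the client retry once the
     * handshake has completed */
    static int safe_to_process(SSL *ssl, const char *method)
    {
        if (SSL_get_early_data_status(ssl) != SSL_EARLY_DATA_ACCEPTED)
            return 1; /* not early data: no 0-RTT replay risk */
        /* only methods that are harmless when replayed may proceed;
         * GET/HEAD as the usual idempotent examples */
        return !strcmp(method, "GET") || !strcmp(method, "HEAD");
    }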



Re: State of 0-RTT TLS resumption with OpenSSL

2019-01-08 Thread Olivier Houchard
On Tue, Jan 08, 2019 at 03:00:32PM +0100, Janusz Dziemidowicz wrote:
> On Fri, 4 Jan 2019 at 11:59, Olivier Houchard wrote:
> > I understand the concern.
> > I checked and both nghttp2 and nginx disable the replay protection. The idea
> > is you're supposed to allow early data only on harmless requests anyway, ie
> > ones that could be replayed with no consequence.
> 
> Sorry for the late reply, I was pondering the problem ;) I'm pretty ok
> with this patch, especially since others seem to do the same. And my
> use case is DNS-over-TLS, which has no problems with replays anyway ;)
> 
> However, I believe in general this is a bit more complicated. RFC 8446
> described this in detail in section 8:
> https://tools.ietf.org/html/rfc8446#section-8
> My understanding is that RFC highly recommends anti-replay with 0-RTT.
> It seems that s_server implements single use tickets, which is exactly
> what is in section 8.1. The above patch disables anti-replay
> completely in haproxy, which might warrant some updates to
> documentation about allow-0rtt option?
> 

Hi Janusz,

Yes indeed, I thought I documented it better than that, but obviously I
didn't :)
The allow-0rtt option was added before OpenSSL added anti-replay protection,
and I'm pretty sure the RFC wasn't talking about it yet; it mostly said 0-RTT
was a security concern. So the option was designed around "only allow it for
what would be safe to replay", and the documentation should certainly reflect
that. I will make it explicit.

Thanks a lot !

Olivier



Re: [PATCH] BUG/MEDIUM: init: Initialize idle_orphan_conns for first server in server-template

2019-01-09 Thread Olivier Houchard
Hi,

On Wed, Jan 09, 2019 at 01:44:08AM -0500, cripy wrote:
> Hi,
> 
> I found a segfault when using server-template within 1.9.x and 2.0-dev.
> This seems to be related to  "http-reuse" as when I set to "never" it does
> not crash anymore.
> 
> It appears that idle_orphan_conns is not being properly initialized for the
> first server within the server-template.  I was able to confirm this by
> creating a small server-template with 4 servers and setting all of the
> addresses except for the first one.  This did not result in a crash.  As soon
> as I set it and a request was sent to the first address, it resulted in a crash.
> 
> I found that server_template_init() establishes everything fine for all
> servers (setting id from prefix with srv_set_id_from_prefix() , etc... )
> and then at the bottom of the function you can see it calls
> srv_set_id_from_prefix() to then establish the id for the first server --
> however, the first server doesn't get any of the logic to initialize the
> idle_orphan_conns.
> 
> My initial fix added the idle_orphan_conns initialization code to the
> bottom of server_template_init() (right below the srv_set_id_from_prefix()
> which sets the prefix specifically for the first server slot) -- however
> this seemed like it might be too messy.
> 
> I believe a better option is to remove the check for !srv->tmpl_info.prefix
> within server_finalize_init().  Patch attached.
> 
> Feel free to correct me if I am wrong on this assumption.
> 
> Here is the config which results in a crash:
> 
> listen fe_main
> mode http
> bind *:80
> timeout server 5ms
> timeout client 5ms
> timeout connect 5ms
> server-template srv 2 10.1.0.1:80
> 
> (Should segfault after the first request)
> 
> HA-Proxy version 2.0-dev0-251a6b-97 2019/01/08 - https://haproxy.org/
> Build options :
>   TARGET  = linux2628
>   CPU = generic
>   CC  = gcc
>   CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
> -fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter
> -Wno-old-style-declaration -Wno-ignored-qualifiers -Wno-clobbered
> -Wno-missing-field-initializers -Wtype-limits
>   OPTIONS = USE_OPENSSL=1
> 
> Backtrace:
> [New LWP 14046]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `./haproxy -f crash.cfg -d'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x004f82fe in srv_add_to_idle_list (conn=0x2331320,
> srv=0x22aeb60) at include/proto/server.h:244
> 244LIST_ADDQ(&srv->idle_orphan_conns[tid], &conn->list);
> (gdb) bt
> #0  0x004f82fe in srv_add_to_idle_list (conn=0x2331320,
> srv=0x22aeb60) at include/proto/server.h:244
> #1  session_free (sess=0x2330970) at src/session.c:90
> #2  0x0050dca3 in mux_pt_destroy (ctx=0x2330920) at src/mux_pt.c:38
> #3  0x00446bdb in cs_destroy (cs=0x2331230) at
> include/proto/connection.h:708
> #4  si_release_endpoint (si=si@entry=0x2330cd8) at
> include/proto/stream_interface.h:170
> #5  0x0044c9ec in stream_free (s=0x2330a40) at src/stream.c:446
> #6  process_stream (t=t@entry=0x2330e30, context=0x2330a40,
> state=) at src/stream.c:2610
> #7  0x00509955 in process_runnable_tasks () at src/task.c:432
> #8  0x0048b485 in run_poll_loop () at src/haproxy.c:2619
> #9  run_thread_poll_loop (data=data@entry=0x23267d0) at src/haproxy.c:2684
> #10 0x0040aa0c in main (argc=, argv=0x7fffd8018e48)
> at src/haproxy.c:3313
> 
> (gdb) frame 0
> #0  0x004f82fe in srv_add_to_idle_list (conn=0x2331320,
> srv=0x22aeb60) at include/proto/server.h:244
> 244LIST_ADDQ(&srv->idle_orphan_conns[tid], &conn->list);
> 
> (gdb) print &srv->idle_orphan_conns[tid]
> $1 = (struct list *) 0x0
> 
> (gdb) print &conn->list
> $2 = (struct list *) 0x2331370


Oops, that seems right, and the patch looks fine, Willy can you push it ?

Thanks a lot !

Olivier
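For readers following the backtrace: LIST_ADDQ dereferences
srv->idle_orphan_conns[tid], and that per-thread array was never allocated for
the first template slot, hence the NULL pointer. A self-contained sketch of
the initialization every server needs (names approximate haproxy's, types
reduced to a minimum):

    #include <stdlib.h>

    /* minimal stand-ins for haproxy's struct list / LIST_INIT */
    struct list { struct list *n, *p; };
    #define LIST_INIT(l) ((l)->n = (l)->p = (l))

    struct server_idle {
        struct list *idle_orphan_conns; /* one list head per thread */
    };

    /* what the finalization step must do for *every* server, including
     * the first slot of a server-template (the one the bug skipped) */
    static int init_idle_conns(struct server_idle *srv, int nbthread)
    {
        int i;

        srv->idle_orphan_conns = calloc(nbthread, sizeof(*srv->idle_orphan_conns));
        if (!srv->idle_orphan_conns)
            return -1;
        for (i = 0; i < nbthread; i++)
            LIST_INIT(&srv->idle_orphan_conns[i]);
        return 0;
    }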



Re: State of 0-RTT TLS resumption with OpenSSL

2019-01-09 Thread Olivier Houchard
Hi Willy,

On Tue, Jan 08, 2019 at 03:44:07PM +0100, Willy Tarreau wrote:
> On Tue, Jan 08, 2019 at 03:27:58PM +0100, Olivier Houchard wrote:
> > On Tue, Jan 08, 2019 at 03:00:32PM +0100, Janusz Dziemidowicz wrote:
> > > On Fri, 4 Jan 2019 at 11:59, Olivier Houchard wrote:
> > > However, I believe in general this is a bit more complicated. RFC 8446
> > > described this in detail in section 8:
> > > https://tools.ietf.org/html/rfc8446#section-8
> > > My understanding is that RFC highly recommends anti-replay with 0-RTT.
> > > It seems that s_server implements single use tickets, which is exactly
> > > what is in section 8.1. The above patch disables anti-replay
> > > completely in haproxy, which might warrant some updates to
> > > documentation about allow-0rtt option?
> > > 
> > 
> > Hi Janusz,
> > 
> > Yes indeed, I thought I documented it better than that, but obviously I
> > didn't :)
> > The allow-0rtt option was added before OpenSSL added anti-replay protection,
> > and I'm pretty sure the RFC wasn't talking about it yet, it was mostly 
> > saying
> > it was a security concern, so it was designed with "only allow it for what
> > would be safe to replay", and the documentation should certainly reflect 
> > that.
> > I will make it explicit.
> > 
> > Thanks a lot !
> 
> To clear some of the confusion on this subject, when 0-RTT was designed
> in TLS 1.3, it was sort of a backport from QUIC, and it was decided that
> the replay protection was up to the application layer (here "application"
> is not in the sense of the web application you're using, but the HTTP layer
> on top of TLS). And RFC8470 (https://tools.ietf.org/html/rfc8470), which
> haproxy implements, was produced exactly to address this specific area.
> 
> Thus an HTTP implementation which properly implements RFC8470 does have
> TLS anti-replay covered by the application layer.
> 

Can you push the attached patches ?

Thanks !

Olivier
>From fe291b43e1cb9143edf0051a1d69247dde298118 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 2 Jan 2019 18:46:41 +0100
Subject: [PATCH 1/2] BUG/MEDIUM: ssl: Disable anti-replay protection and set
 max data with 0RTT.

When using early data, disable the OpenSSL anti-replay protection, and set
the max amount of early data we're ready to accept, based on the size of
buffers, or early data won't work with the released OpenSSL 1.1.1.

This should be backported to 1.8.
---
 src/ssl_sock.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 282b85dd..13ce2e5b 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -3869,6 +3869,10 @@ ssl_sock_initial_ctx(struct bind_conf *bind_conf)
SSL_CTX_set_select_certificate_cb(ctx, ssl_sock_switchctx_cbk);
SSL_CTX_set_tlsext_servername_callback(ctx, ssl_sock_switchctx_err_cbk);
 #elif (OPENSSL_VERSION_NUMBER >= 0x10101000L)
+   if (bind_conf->ssl_conf.early_data) {
+   SSL_CTX_set_options(ctx, SSL_OP_NO_ANTI_REPLAY);
+   SSL_CTX_set_max_early_data(ctx, global.tune.bufsize - global.tune.maxrewrite);
+   }
SSL_CTX_set_client_hello_cb(ctx, ssl_sock_switchctx_cbk, NULL);
SSL_CTX_set_tlsext_servername_callback(ctx, ssl_sock_switchctx_err_cbk);
 #else
-- 
2.14.4

>From c1105e9723a2c79e3b445287586350e0d3d3e291 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Tue, 8 Jan 2019 15:35:32 +0100
Subject: [PATCH 2/2] DOC: Be a bit more explicit about allow-0rtt security
 implications.

Document a bit better that allow-0rtt can trivially be used for replay attacks,
and so should only be used when it's safe to replay a request.

This should probably be backported to 1.8 and 1.9.
---
 doc/configuration.txt | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 2447254c..888515fb 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -10768,7 +10768,10 @@ accept-proxy
 
 allow-0rtt
   Allow receiving early data when using TLSv1.3. This is disabled by default,
-  due to security considerations.
+  due to security considerations. Because it is vulnerable to replay attacks,
+  you should only allow it for requests that are safe to replay, i.e. requests
+  that are idempotent. You can use the "wait-for-handshake" action for any
+  request that wouldn't be safe with early data.
 
 alpn <protocols>
   This enables the TLS ALPN extension and advertises the specified protocol
-- 
2.14.4



Re: Lots of mail from email alert on 1.9.x

2019-01-11 Thread Olivier Houchard
Hi,

On Fri, Jan 11, 2019 at 10:36:04AM +0100, Johan Hendriks wrote:
> Thanks all.
> No rush on my side as it is a test machine, at least we do know when a
> backend server fails.
> With all this mail the mail server are giving alarms too :D
> 
> I disable the mail feature for now.
> 
> Thanks again Pieter, Willy and Oliver for all your work.
> 
> 
> Op 10-01-19 om 20:05 schreef PiBa-NL:
> > Hi Johan, Olivier, Willy,
> >
> > Op 10-1-2019 om 17:00 schreef Johan Hendriks:
> >> I just updated to 1.9.1 on my test system.
> >>
> >> We noticed that when a server fails we now get tons of mail, and with
> >> tons we mean a lot.
> >>
> >> After a client backend server fails we usually get 1 mail on 1.8.x now
> >> with 1.9.1 within 1 minute we have the following.
> >>
> >> mailq | grep -B2 l...@testdomain.nl | grep '^[A-F0-9]' | awk '{print
> >> $1}' | sed 's/*//' | postsuper -d -
> >> postsuper: Deleted: 19929 messages
> >>
> >> My setting from the backend part is as follows.
> >>
> >>  email-alert mailers alert-mailers
> >>  email-alert from l...@testdomain.nl
> >>  email-alert to not...@testdomain.nl
> >>  server webserver09 11.22.33.44:80 check
> >>
> >> Has something changed in 1.9.x (it was on 1.9.0 also)
> >>
> >> regards
> >> Johan Hendriks
> >>
> >>
> > Its a 'known issue' see:
> > https://www.mail-archive.com/haproxy@formilux.org/msg32290.html
> > a 'regtest' is added in that mail thread also to aid developers in
> > reproducing the issue and validating a possible fix.
> >
> > @Olivier, Willy, may i assume this mailbomb feature is 'planned' to
> > get fixed in 1.9.2 ? (perhaps a bugtracker with a 'target version'
> > would be nice ;) ?)
> >
> > Regards,
> > PiBa-NL (Pieter)
> 

Ok so erm, I'd be lying if I claimed I enjoy working on the check code, or
that I understand it fully. However, after talking with Willy and Christopher,
I think I may have come up with an acceptable solution, and the attached patch
should fix it (at worst by getting haproxy to segfault, but it shouldn't
mailbomb you anymore).
Pieter, I'd be very interested to know if it still works with your setup.
It's a different way of trying to fix what you tried to fix with
1714b9f28694d750d446917672dd59c46e16afd7 
I'd like to be sure I didn't break it for you again :)

Regards,

Olivier
>From 7e7b41cac480029c5fd93338dd86a875fee0b5a7 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Fri, 11 Jan 2019 18:17:17 +0100
Subject: [PATCH 1/2] MINOR: checks: Store the proxy in checks.

Instead of assuming we have a server, store the proxy directly in struct
check, and use it instead of s->server.
This should be a no-op for now, but will be useful later when we change
mail checks to avoid having a server.

This should be backported to 1.9.
---
 include/types/checks.h |  1 +
 src/checks.c   | 32 +---
 src/server.c   |  1 +
 3 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/include/types/checks.h b/include/types/checks.h
index 6346fe33..f89abcba 100644
--- a/include/types/checks.h
+++ b/include/types/checks.h
@@ -180,6 +180,7 @@ struct check {
int send_string_len;/* length of agent command string */
char *send_string;  /* optionally send a string when connecting to the agent */
struct server *server;  /* back-pointer to server */
+   struct proxy *proxy;/* proxy to be used */
char **argv;/* the arguments to use if running a process-based check */
char **envp;/* the environment to use if running a process-based check */
struct pid_list *curpid;/* entry in pid_list used for current process-based test, or -1 if not in test */
diff --git a/src/checks.c b/src/checks.c
index 4baaf9fc..edb61b4f 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -2124,7 +2124,7 @@ static struct task *process_chk_proc(struct task *t, void *context, unsigned short state)
 static struct task *process_chk_conn(struct task *t, void *context, unsigned short state)
 {
struct check *check = context;
-   struct server *s = check->server;
+   struct proxy *proxy = check->proxy;
struct conn_stream *cs = check->cs;
struct connection *conn = cs_conn(cs);
int rv;
@@ -2142,7 +2142,7 @@ static struct task *process_chk_conn(struct task *t, void *context, unsigned short state)
 * is disabled.
 */

Re: Lots of mail from email alert on 1.9.x

2019-01-11 Thread Olivier Houchard
On Fri, Jan 11, 2019 at 06:53:11PM +0100, Olivier Houchard wrote:
> Hi,
> 
> On Fri, Jan 11, 2019 at 10:36:04AM +0100, Johan Hendriks wrote:
> > Thanks all.
> > No rush on my side as it is a test machine, at least we do know when a
> > backend server fails.
> > With all this mail the mail server are giving alarms too :D
> > 
> > I disable the mail feature for now.
> > 
> > Thanks again Pieter, Willy and Oliver for all your work.
> > 
> > 
> > Op 10-01-19 om 20:05 schreef PiBa-NL:
> > > Hi Johan, Olivier, Willy,
> > >
> > > Op 10-1-2019 om 17:00 schreef Johan Hendriks:
> > >> I just updated to 1.9.1 on my test system.
> > >>
> > >> We noticed that when a server fails we now get tons of mail, and with
> > >> tons we mean a lot.
> > >>
> > >> After a client backend server fails we usually get 1 mail on 1.8.x now
> > >> with 1.9.1 within 1 minute we have the following.
> > >>
> > >> mailq | grep -B2 l...@testdomain.nl | grep '^[A-F0-9]' | awk '{print
> > >> $1}' | sed 's/*//' | postsuper -d -
> > >> postsuper: Deleted: 19929 messages
> > >>
> > >> My setting from the backend part is as follows.
> > >>
> > >>  email-alert mailers alert-mailers
> > >>  email-alert from l...@testdomain.nl
> > >>  email-alert to not...@testdomain.nl
> > >>  server webserver09 11.22.33.44:80 check
> > >>
> > >> Has something changed in 1.9.x (it was on 1.9.0 also)
> > >>
> > >> regards
> > >> Johan Hendriks
> > >>
> > >>
> > > Its a 'known issue' see:
> > > https://www.mail-archive.com/haproxy@formilux.org/msg32290.html
> > > a 'regtest' is added in that mail thread also to aid developers in
> > > reproducing the issue and validating a possible fix.
> > >
> > > @Olivier, Willy, may i assume this mailbomb feature is 'planned' to
> > > get fixed in 1.9.2 ? (perhaps a bugtracker with a 'target version'
> > > would be nice ;) ?)
> > >
> > > Regards,
> > > PiBa-NL (Pieter)
> > 
> 
> Ok so erm, I'd be lying if I claimed I enjoy working on the check code, or
> that I understand it fully. However, after talking with Willy and Christopher,
> I think I may have come up with an acceptable solution, and the attached patch
> should fix it (at worst by getting haproxy to segfault, but it shouldn't
> mailbomb you anymore).
> Pieter, I'd be very interested to know if it still works with your setup.
> It's a different way of trying to fix what you tried to fix with
> 1714b9f28694d750d446917672dd59c46e16afd7 
> I'd like to be sure I didn't break it for you again :)
> 
> Regards,
> 
> Olivier

(Slightly modified patches, I think there was a potential race condition
when running with multiple threads).

Olivier
>From 7e7b41cac480029c5fd93338dd86a875fee0b5a7 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Fri, 11 Jan 2019 18:17:17 +0100
Subject: [PATCH 1/2] MINOR: checks: Store the proxy in checks.

Instead of assuming we have a server, store the proxy directly in struct
check, and use it instead of s->server.
This should be a no-op for now, but will be useful later when we change
mail checks to avoid having a server.

This should be backported to 1.9.
---
 include/types/checks.h |  1 +
 src/checks.c   | 32 +---
 src/server.c   |  1 +
 3 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/include/types/checks.h b/include/types/checks.h
index 6346fe33..f89abcba 100644
--- a/include/types/checks.h
+++ b/include/types/checks.h
@@ -180,6 +180,7 @@ struct check {
int send_string_len;/* length of agent command string */
char *send_string;  /* optionally send a string when connecting to the agent */
struct server *server;  /* back-pointer to server */
+   struct proxy *proxy;/* proxy to be used */
char **argv;/* the arguments to use if running a process-based check */
char **envp;/* the environment to use if running a process-based check */
struct pid_list *curpid;/* entry in pid_list used for current process-based test, or -1 if not in test */
diff --git a/src/checks.c b/src/checks.c
index 4baaf9fc..edb61b4f 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -2124,7 +2124,7 @@ static struct task *process_chk_proc(struct task *t, void *context, unsigned short state)

Re: Lots of mail from email alert on 1.9.x

2019-01-13 Thread Olivier Houchard
Hi,

On Sat, Jan 12, 2019 at 01:11:29PM +0100, Willy Tarreau wrote:
> Hi Pieter,
> 
> On Sat, Jan 12, 2019 at 01:00:33AM +0100, PiBa-NL wrote:
> > Thanks for this 'change in behavior' ;). Indeed the mailbomb is fixed, and
> > it seems the expected mails get generated and delivered, but a segfault also
> > happens on occasion. Not with the regtest as it was, but with a few minor
> > modifications (adding a unreachable mailserver, and giving it a little more
> > time seems to be the most reliable reproduction a.t.m.) it will crash
> > consistently after 11 seconds.. So i guess the patch needs a bit more
> > tweaking.
> > 
> > Regards,
> > PiBa-NL (Pieter)
> > 
> > Core was generated by `haproxy -d -f /tmp/vtc.37274.4b8a1a3a/h1/cfg'.
> > Program terminated with signal SIGSEGV, Segmentation fault.
> > #0  0x00500955 in chk_report_conn_err (check=0x802616a10,
> > errno_bck=0, expired=1) at src/checks.c:689
> > 689 dns_trigger_resolution(check->server->dns_requester);
> > (gdb) bt full
> > #0  0x00500955 in chk_report_conn_err (check=0x802616a10,
> > errno_bck=0, expired=1) at src/checks.c:689
> 
> Indeed, this function should not have any special effect in this case,
> it is needed to prepend this at the beginning of chk_report_conn_err() :
> 
>   if (!check->server)
>   return;
> 
> We need to make sure that check->server is properly tested everywhere.
> With a bit of luck this one was the only remnant.
> 

I'd rather just avoid calling dns_trigger_resolution() if there's no server;
it seems to be the only use of check->server in chk_report_conn_err(), so
set_server_check_status() still gets called and the check's status and result
may be updated. Not sure it is really needed, but I'd rather not offend
the Check Gods.
The attached patches are updated to do just that.

Regards,

Olivier
>From 7e7b41cac480029c5fd93338dd86a875fee0b5a7 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Fri, 11 Jan 2019 18:17:17 +0100
Subject: [PATCH 1/2] MINOR: checks: Store the proxy in checks.

Instead of assuming we have a server, store the proxy directly in struct
check, and use it instead of s->server.
This should be a no-op for now, but will be useful later when we change
mail checks to avoid having a server.

This should be backported to 1.9.
---
 include/types/checks.h |  1 +
 src/checks.c   | 32 +---
 src/server.c   |  1 +
 3 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/include/types/checks.h b/include/types/checks.h
index 6346fe33..f89abcba 100644
--- a/include/types/checks.h
+++ b/include/types/checks.h
@@ -180,6 +180,7 @@ struct check {
int send_string_len;/* length of agent command string */
char *send_string;  /* optionally send a string when connecting to the agent */
struct server *server;  /* back-pointer to server */
+   struct proxy *proxy;/* proxy to be used */
char **argv;/* the arguments to use if running a process-based check */
char **envp;/* the environment to use if running a process-based check */
struct pid_list *curpid;/* entry in pid_list used for current process-based test, or -1 if not in test */
diff --git a/src/checks.c b/src/checks.c
index 4baaf9fc..edb61b4f 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -2124,7 +2124,7 @@ static struct task *process_chk_proc(struct task *t, void *context, unsigned short state)
 static struct task *process_chk_conn(struct task *t, void *context, unsigned short state)
 {
struct check *check = context;
-   struct server *s = check->server;
+   struct proxy *proxy = check->proxy;
struct conn_stream *cs = check->cs;
struct connection *conn = cs_conn(cs);
int rv;
@@ -2142,7 +2142,7 @@ static struct task *process_chk_conn(struct task *t, void *context, unsigned short state)
 * is disabled.
 */
if (((check->state & (CHK_ST_ENABLED | CHK_ST_PAUSED)) != CHK_ST_ENABLED) ||
-   s->proxy->state == PR_STSTOPPED)
+   proxy->state == PR_STSTOPPED)
goto reschedule;

/* we'll initiate a new check */
@@ -2167,8 +2167,8 @@ static struct task *process_chk_conn(struct task *t, void *context, unsigned short state)
 */
t->expire = tick_add(now_ms, MS_TO_TICKS(check->inter));

-   if (s->proxy->timeout.check && s->proxy->timeout.connect) {
-   int t_con = tick_add(now_ms,
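The second patch does not appear in full above; the guard Olivier describes
amounts to something like this in chk_report_conn_err() (my reconstruction
from the discussion, not the committed text):

    /* only servers have a DNS requester; email-alert checks now run with
     * check->server == NULL and only carry check->proxy */
    if (check->server)
        dns_trigger_resolution(check->server->dns_requester);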

Re: Segfault in assign_tproxy_address with h2 and source address .[1.9.2]

2019-01-17 Thread Olivier Houchard
Hi Luke,

On Thu, Jan 17, 2019 at 02:35:38PM +, Luke Seelenbinder wrote:
> Hello all,
> 
> First, I wanted to say a huge thanks to the team for producing a quality 
> piece of software. My company just moved all of our traffic over, and the 
> performance and nimbleness of haproxy is impressive. I'm testing 1.9.2 for 
> migration as soon as it's stable for our use-case.
> 
> I'm experiencing a segfault when running in mode http, option http-use-htx, 
> with h2 backends (alpn h2) and an assigned source address. A sanitized config 
> is as follows:
> 
> defaults
>source []
>mode http
>http-reuse always
>option http-use-htx
> 
> listen test
>bind :443 ssl crt  alpn h2,http/1.1
>server backend ipv6@ check ssl crt  ca-file  verifyhost  alpn h2,http/1.1 check-alpn http/1.1
> 
> If I disable h2 on the backend, it works correctly. If I disable the source 
> in defaults, it works correctly. I've attached the backtrace below.
> 
> Best,
> Luke
> 

I think I understand what's going on.
Does the attached patch fix it for you ?

Thanks a lot !

Olivier
>From 585bcc7f8ec84573d070d2d9d1e0b104fd18eb48 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Thu, 17 Jan 2019 15:59:13 +0100
Subject: [PATCH] BUG/MEDIUM: servers: Make assign_tproxy_address work when
 ALPN is set.

If an ALPN is set on the server line, then when we reach assign_tproxy_address,
the stream_interface's endpoint will be a connection, not a conn_stream,
so make sure assign_tproxy_address() handles both cases.

This should be backported to 1.9.
---
 src/backend.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/backend.c b/src/backend.c
index ab5a629c..3e327975 100644
--- a/src/backend.c
+++ b/src/backend.c
@@ -1021,7 +1021,12 @@ static void assign_tproxy_address(struct stream *s)
struct server *srv = objt_server(s->target);
struct conn_src *src;
struct connection *cli_conn;
-   struct connection *srv_conn = cs_conn(objt_cs(s->si[1].end));
+   struct connection *srv_conn;
+
+   if (objt_cs(s->si[1].end))
+   srv_conn = cs_conn(__objt_cs(s->si[1].end));
+   else
+   srv_conn = objt_conn(s->si[1].end);
 
if (srv && srv->conn_src.opts & CO_SRC_BIND)
src = &srv->conn_src;
-- 
2.19.1
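For anyone unfamiliar with haproxy's object model, the reason both casts are
needed is that si->end is a tagged pointer whose first field identifies what
it points at; objt_cs()/objt_conn() return NULL unless the tag matches. A
simplified model of the pattern (my illustration, not the real definitions):

    #include <stddef.h>

    enum obj_type { OBJ_TYPE_CONN, OBJ_TYPE_CS };

    struct connection  { enum obj_type obj_type; /* ... */ };
    struct conn_stream { enum obj_type obj_type; struct connection *conn; /* ... */ };

    /* what the patched assign_tproxy_address() effectively does: accept
     * either endpoint type and always end up with the connection */
    static struct connection *endpoint_conn(enum obj_type *end)
    {
        if (!end)
            return NULL;
        if (*end == OBJ_TYPE_CS)
            return ((struct conn_stream *)end)->conn;
        return (struct connection *)end;
    }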



Re: Segfault in assign_tproxy_address with h2 and source address .[1.9.2]

2019-01-17 Thread Olivier Houchard
On Thu, Jan 17, 2019 at 03:09:20PM +, Luke Seelenbinder wrote:
> Hi Oliver,
> 
> Yes! I can confirm the patch does indeed work—thanks for the quick turnaround.
> 
> Best,
> Luke
> 

Willy, can you push it ?

Thanks !

Olivier
>From 585bcc7f8ec84573d070d2d9d1e0b104fd18eb48 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Thu, 17 Jan 2019 15:59:13 +0100
Subject: [PATCH] BUG/MEDIUM: servers: Make assign_tproxy_address work when
 ALPN is set.

If an ALPN is set on the server line, then when we reach assign_tproxy_address,
the stream_interface's endpoint will be a connection, not a conn_stream,
so make sure assign_tproxy_address() handles both cases.

This should be backported to 1.9.
---
 src/backend.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/backend.c b/src/backend.c
index ab5a629c..3e327975 100644
--- a/src/backend.c
+++ b/src/backend.c
@@ -1021,7 +1021,12 @@ static void assign_tproxy_address(struct stream *s)
struct server *srv = objt_server(s->target);
struct conn_src *src;
struct connection *cli_conn;
-   struct connection *srv_conn = cs_conn(objt_cs(s->si[1].end));
+   struct connection *srv_conn;
+
+   if (objt_cs(s->si[1].end))
+   srv_conn = cs_conn(__objt_cs(s->si[1].end));
+   else
+   srv_conn = objt_conn(s->si[1].end);
 
if (srv && srv->conn_src.opts & CO_SRC_BIND)
src = &srv->conn_src;
-- 
2.19.1



Re: Compilation fails on OS-X

2019-02-13 Thread Olivier Houchard
Hi Patrick,

On Wed, Feb 13, 2019 at 10:01:01AM -0500, Patrick Hemmer wrote:
> 
> 
> On 2019/2/13 09:40, Aleksandar Lazic wrote:
> > Am 13.02.2019 um 14:45 schrieb Patrick Hemmer:
> >> Trying to compile haproxy on my local machine for testing purposes and am
> >> running into the following:
> > Which compiler do you use?
> 
>   # gcc -v
>   Configured with: 
> --prefix=/Applications/Xcode.app/Contents/Developer/usr 
> --with-gxx-include-dir=/usr/include/c++/4.2.1
>   Apple LLVM version 9.0.0 (clang-900.0.39.2)
>   Target: x86_64-apple-darwin17.7.0
>   Thread model: posix
>   InstalledDir: 
> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
> 
> >> # make TARGET=osx
> >> src/proto_http.c:293:1: error: argument to 'section' attribute is 
> >> not
> >> valid for this target: mach-o section specifier requires a segment and 
> >> section
> >> separated by a comma
> >> DECLARE_POOL(pool_head_http_txn, "http_txn", sizeof(struct 
> >> http_txn));
> >> ^
> >> include/common/memory.h:128:2: note: expanded from macro 
> >> 'DECLARE_POOL'
> >> REGISTER_POOL(&ptr, name, size)
> >> ^
> >> include/common/memory.h:123:2: note: expanded from macro 
> >> 'REGISTER_POOL'
> >> INITCALL3(STG_POOL, create_pool_callback, (ptr), 
> >> (name),
> >> (size))
> >> ^
> >> include/common/initcall.h:102:2: note: expanded from macro 
> >> 'INITCALL3'
> >> _DECLARE_INITCALL(stage, __LINE__, function, arg1, 
> >> arg2,
> >> arg3)
> >> ^
> >> include/common/initcall.h:78:2: note: expanded from macro
> >> '_DECLARE_INITCALL'
> >> __DECLARE_INITCALL(__VA_ARGS__)
> >> ^
> >> include/common/initcall.h:65:42: note: expanded from macro
> >> '__DECLARE_INITCALL'
> >>
> >> __attribute__((__used__,__section__("init_"#stg))) =   \
> >>
> >>
> >>
> >> Issue occurs on master, and the 1.9 branch
> >>
> >> -Patrick
> 

Does the (totally untested, because I have no Mac to test) patch work for
you ?

Thanks !

Olivier
>From 2f68108ae32a66782b4c7518f8830896732be64d Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 13 Feb 2019 16:22:17 +0100
Subject: [PATCH] BUILD/MEDIUM: initcall: Fix build on MacOS.

MacOS syntax for sections is a bit different, so implement it.

This should be backported to 1.9.
---
 include/common/initcall.h | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/common/initcall.h b/include/common/initcall.h
index 35d237fd..95e488ba 100644
--- a/include/common/initcall.h
+++ b/include/common/initcall.h
@@ -50,6 +50,12 @@ struct initcall {
void *arg3;
 };
 
+#ifdef __APPLE__
+#define HA_SECTION(s) __section__("__DATA, " s)
+#else
+#define HA_SECTION(s) __section__((s))
+#endif
+
 /* Declare a static variable in the init section dedicated to stage <stg>,
  * with an element referencing function <function> and arguments <a1>..<a3>.
  * <linenum> is needed to deduplicate entries created from a same file. The
@@ -62,7 +68,7 @@ struct initcall {
  */
 #define __DECLARE_INITCALL(stg, linenum, function, a1, a2, a3) \
 static const struct initcall *__initcb_##linenum   \
-   __attribute__((__used__,__section__("init_"#stg))) =   \
+   __attribute__((__used__,HA_SECTION("init_"#stg))) =\
(stg < STG_SIZE) ? &(const struct initcall) {  \
.fct = (void (*)(void *,void *,void *))function,   \
.arg1 = (void *)(a1),  \
@@ -113,9 +119,16 @@ struct initcall {
  * empty. The corresponding sections must contain exclusively pointers to
  * make sure each location may safely be visited by incrementing a pointer.
  */
+#ifdef __APPLE__
+#define DECLARE_INIT_SECTION(stg) \
+   extern __attribute__((__weak__)) const struct initcall *__start_init_##stg __asm("section$start$__DATA$" stg); \
+   extern __attribute__((__weak__)) const struct initcall *__stop_init_##stg __asm("section$end$__DATA$" stg)
+
+#else
 #define DECLARE_INIT_SECTION(stg) \
 extern __attribute__((__weak__)) const struct initcall *__start_init_##stg; \
 extern __attribute__((__weak__)) const struct initcall *__stop_init_##stg
+#endif
 
 /* Declare all initcall sections here */
 DECLARE_INIT_SECTION(STG_PREPARE);
-- 
2.20.1
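To see why the section name needed a Mach-O spelling at all, here is a
minimal, self-contained demo of the mechanism on ELF targets: GNU ld
synthesizes the __start_/__stop_ symbols for any section whose name is a
valid C identifier, and the patch's __asm("section$start$__DATA$...")
declarations are the Mach-O equivalent. This is my illustration (it stores
the structs directly, where haproxy stores pointers to them):

    #include <stdio.h>

    struct initcall { void (*fct)(void); };

    /* drop one entry into the "init_demo" section per registered function */
    #define REGISTER_INIT(fn) \
        static const struct initcall __init_##fn \
        __attribute__((__used__, __section__("init_demo"))) = { fn }

    /* linker-provided boundaries of the section (ELF/GNU ld behavior) */
    extern const struct initcall __start_init_demo[];
    extern const struct initcall __stop_init_demo[];

    static void hello(void) { puts("hello from an initcall"); }
    REGISTER_INIT(hello);

    int main(void)
    {
        const struct initcall *ic;

        /* walk every entry registered anywhere in the program */
        for (ic = __start_init_demo; ic < __stop_init_demo; ic++)
            ic->fct();
        return 0;
    }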


