Re: How to configure DH groups for TLS 1.3

2024-05-07 Thread Tristan

Hi Dominik,


On Thu, 2 May 2024 at 17:14, Froehlich, Dominik wrote:

The closest I’ve gotten is the “curves” property: 
https://docs.haproxy.org/2.8/configuration.html#5.1-curves

However, I think it only restricts the available elliptic curves in an ECDHE 
handshake, but it does not prevent a TLS 1.3 client from selecting a non-ECDHE 
prime group, for example “ffdhe8192”.

[snip]
While Lukas answered the specific question better than I could, does the 
hardening guide you're following happen to be a publicly available resource?


Good public guidelines on the topic are very sparse [1], and I'd be 
interested in them if they exist somewhere, if only out of curiosity.
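
For anyone skimming the archive later, the setting in question looks like
this on a bind line (a sketch only; available group names depend on the
OpenSSL build HAProxy is linked against):

    frontend fe_https
        bind :443 ssl crt /etc/haproxy/cert.pem curves X25519:P-256

Whether this also excludes the TLS 1.3 FFDHE groups is exactly the open
question quoted above.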


Regards,
Tristan

[1]: Or often essentially nonexistent, short of reading dozens of papers 
off arXiv, of which the majority seem to focus on PoCs rather than 
practical advice




Re: Changes in HAProxy 3.0's Makefile and build options

2024-04-12 Thread Tristan
Hi Willy,

> On 11 Apr 2024, at 18:18, Willy Tarreau  wrote:
> 
> Some distros simply found that stuffing their regular CFLAGS into
> DEBUG_CFLAGS or CPU_CFLAGS does the trick most of the time. Others use
> other combinations depending on the tricks they figured.

Good to know I wasn’t alone scratching my head about what came from where!

> After this analysis, I figured that we needed to simplify all this, get
> back to what is more natural to use or more commonly found, without
> breaking everything for everyone. […]

These are very nice improvements indeed, but I admit that if (at least 
partially) breaking backwards compatibility were the acceptable choice here, I’d 
have hoped to see something like cmake rather than a makefile refactoring.

Actually, I’d thought a few times in the past about starting a discussion in 
that direction but figured it would be inconceivable.

I don’t know how controversial it is; the main two reasons I mention it are:
- generally better tooling (and IDE support) out of the box: options/flags 
discovery and override specifically tends to be quite a bit simpler as the 
boilerplate and conventions are mostly handled by default
- easier to parse final result: both of them look frankly awful, but the result 
of cmake setups is often a little easier to parse as it encourages a rather 
declarative style (can be done in gmake, but it is very much up to the 
maintainers to be extremely disciplined about it)

Arguably, there’s the downside of requiring an extra tool that not everyone is 
deeply familiar with already, and cmake versions matter more than gmake ones, so 
I would worry about compatibility for distros of the GCC 4 era. Still, it seems 
to me that cmake is proven and widespread enough by now to be worth considering.

Tristan



Re: [RFC] Allow disabling abstract unix socket paths NUL-padding

2024-03-09 Thread Tristan




On 09/03/2024 18:09, Tristan wrote:

Hi Willy,

On 09/03/2024 16:51, Willy Tarreau wrote:

Hi Tristan,

On Sat, Mar 09, 2024 at 04:20:21PM +, Tristan wrote:

To be honest, I don't think this is unfixable. It's just a matter of how
much code change we think is acceptable for it.


I don't mind about the amount of changes. "we've always done it like this"
is never a valid excuse to keep code as it is, especially when it's wrong
and causes difficulties to add new features. Of course I prefer invasive
changes early in the dev cycle than late but we're still OK and I don't
expect that many changes for this anyway.


Agreed on not shying away from amount of changes.

I tried to follow that route, being generous in what I was willing to 
touch, but ran into the issue of protocols and socket families being a 
1:1 binding.


So adding a new socket family *requires* adding a new protocol too, 
which isn't too hard but spirals a bit out of control:
- protocol_by_family is a misnomer, and it is actually protocols *by 
socket domain* (which is still AF_UNIX, even with family AF_CUST_ABNS2, 
and ends up overriding standard socket/abns protocol)
- but you can't just change that, because then other things stop working 
as some places rely on socket domain != socket family


Following up on this, I'm more and more convinced that it actually 
makes things somewhat worse to take this approach...


At least I don't see a good way to make it work without looking very 
bad, because in a few places we look up a protocol based on a 
sockaddr_storage reference, so we have truly only the socket domain to 
work with if no connection is established yet.


While it sounds like shying away from deeper changes, it seems to me 
that a bind/server option is truly the right way to go here.


Tristan



Re: [RFC] Allow disabling abstract unix socket paths NUL-padding

2024-03-09 Thread Tristan

Hi Willy,

On 09/03/2024 16:51, Willy Tarreau wrote:

Hi Tristan,

On Sat, Mar 09, 2024 at 04:20:21PM +, Tristan wrote:

To be honest, I don't think this is unfixable. It's just a matter of how
much code change we think is acceptable for it.


I don't mind about the amount of changes. "we've always done it like this"
is never a valid excuse to keep code as it is, especially when it's wrong
and causes difficulties to add new features. Of course I prefer invasive
changes early in the dev cycle than late but we're still OK and I don't
expect that many changes for this anyway.


Agreed on not shying away from amount of changes.

I tried to follow that route, being generous in what I was willing to 
touch, but ran into the issue of protocols and socket families being a 
1:1 binding.


So adding a new socket family *requires* adding a new protocol too, 
which isn't too hard but spirals a bit out of control:
- protocol_by_family is a misnomer, and it is actually protocols *by 
socket domain* (which is still AF_UNIX, even with family AF_CUST_ABNS2, 
and ends up overriding standard socket/abns protocol)
- but you can't just change that, because then other things stop working 
as some places rely on socket domain != socket family



If push comes to shove, one
can always add an extra field somewhere, or an extra arg in get_addr_len,
even if it'd be a little ugly.


Actually you're totally right and you make me realize that there's an
addrcmp field in the address families, and it's inconsistent to have
get_addr_len() being family-agnostic. So most likely we should first
change get_addr_len() to be per-family and appear in the proto_fam
struct. The few places where get_addr_len() is used should instead
rely on the protocol family for this. And due to this new address
having a variable length, we should support passing a pointer to the
address (like the current get_addr_len() does) in addition to the
protocol pointer.


I tried to do that yes, changing sock_addrlen to be a function pointer.
But it's actually less convenient, surprisingly, since then you end up 
with constructs like:


  listener->rx/tx->ss->proto->fam->get_addrlen(listener->rx/tx->ss)

(paraphrasing from memory; point is that it becomes somewhat unpleasant 
to look at)


Still might end up having to do that given how tricky things become 
otherwise...



Doing so would cleanly unlock all the problem. The new abns2 family
would just bring its own get_addr_len() variant that uses strlen().


Agreed on that OOP-ish approach being in principle a bit nicer though 
(sorry for the swear word ;-) )


Tristan



Re: [RFC] Allow disabling abstract unix socket paths NUL-padding

2024-03-09 Thread Tristan

Hi Willy,

On 08/03/2024 18:01, Willy Tarreau wrote:

The problem with default values is that a single behavior cannot be deduced
from reading a single config statement. That's quite painful for lots of
people (including those who copy config blocks from stackoverflow), and
for API tools. And it works half of the time because internally both modes
work but they can't interoperate with other tools. Here we're really
indicating a new address format. There's nothing wrong with that and we
do have the tools to handle that.

I think that the plan should be:

   - add AF_CUST_ABNS2 to protocol-t.h before AF_CUST_MAX
   - add to str2sa_range() the case for "abns2" just after "abns" with
 address family "AF_CUST_ABNS2", and duplicate the test for
 ss.ss_family == AF_UNIX later for AF_CUST_ABNS2 removing all the
 stuff related to !abstract, or instead, just extend the "if" condition
 to the new family and add that case there as well (might be even easier).
   - duplicate "proto_fam_unix" to "proto_fam_abns2" in sock_unix.c, referencing
 "sock_abns2_addrcmp" for the address comparison
   - duplicate sock_unix_addrcmp() to sock_abns2_addrcmp() and implement only
 the abns case, basically:

        if (a->ss_family != b->ss_family)
                return -1;

        if (a->ss_family != AF_CUST_ABNS2)
                return -1;

        if (au->sun_path[0])
                return -1;

        return memcmp(au->sun_path, bu->sun_path, sizeof(au->sun_path));
   - in sock_unix_bind_receiver(), add a case for this address family in
 the bind() size argument, replacing sizeof(addr) with the string
 length when running AF_CUST_ABNS2.

   - for get_addr_len() I need to think :-/  I actually don't know how
 that one works with socketpairs already, so there might be a trick
 I'm overlooking.


Alright. I will look at how that looks and report back shortly.


I need to leave now, but I continue to think that if there is any internal
shortcoming, we should not try to work around it at the expense of trickier
long-term maintenance, instead we should address it. And don't worry, I'm
not going to assign you the task of dealing with limited stuff. We may end
up with the workaround in the worst case if we don't find better, but that
would mean declaring a failure and having to break that stuff sometime in
the future, which is always a pain.


To be honest, I don't think this is unfixable. It's just a matter of how 
much code change we think is acceptable for it. If push comes to shove, 
one can always add an extra field somewhere, or an extra arg in 
get_addr_len, even if it'd be a little ugly.


Tristan



Re: [RFC] Allow disabling abstract unix socket paths NUL-padding

2024-03-08 Thread Tristan

Hi Willy,

On 08/03/2024 17:05, Willy Tarreau wrote:

We could just have "abns2" and declare that it's the second version of the
abns format and know that this one is interoperable with a number of other
components.


Having abns@ and abns2@ strictly for that one difference seems like a 
lot to me, to be honest, but I get the argument that they are otherwise 
fundamentally incompatible... Hmm...
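
To make the difference concrete, this is roughly what the two prefixes
would mean on a bind line (abns2@ being the syntax proposed here, not
something that exists today):

    bind abns@mysock    # sun_path is "mysock" NUL-padded to its full length
    bind abns2@mysock   # sun_path is exactly "mysock", length from strlen()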



It would even be easier to deal with for those who document
such components because their integration docs would just indicate that
the address has to be placed after "abns2@" in haproxy, instead of saying
"and don't forget to adjust this global setting in your config, unless you
already use abns anywhere else".


Considering there's barely any documentation on the internet as a whole 
about abstract unix sockets, I doubt we should worry too much about that 
bit.

Also the target audience for those is unlikely to be too deterred.


Indeed, right now it seems that your patch
does nothing for the receivers, bind() still uses sizeof(), and the
sock_unix_addrcmp() function continues to compare the full name, making
such sockets non-transferable across reloads. So it's unclear to me how
this can work in your test :-/


Maybe I'm missing some place/case, but at least for trivial binds it 
certainly works; which explains why the test passes :-)


// without tightsocklen
$ netstat -l | grep @haproxy
unix  2  [ ACC ] STREAM LISTENING 934838   @haproxy-f1@


// with tightsocklen
$ netstat -l | grep @haproxy
unix  2  [ ACC ] STREAM LISTENING 956878   @haproxy-f1


Thus I think it would be preferable to go that route rather than having
to deal with all exceptions everywhere based on a global-only option.
We can help on this, of course.


Well I do want it so I'm fine with doing it. But I wonder if it wouldn't 
end up a bit like this:
- abns@ eventually not used by anyone, as everything keeps on moving away 
from padding
- abns2@ eventually the default, which is quite jarring I suppose

If it's easier that way, I'd prefer making it a proper bind/server 
option, which allows for an easier transition, and we can just change the 
default value down the line if necessary?


Thanks,
Tristan



[RFC] Allow disabling abstract unix socket paths NUL-padding

2024-03-05 Thread Tristan

Hello,

Earlier, I ran into the issue outlined in 
https://github.com/haproxy/haproxy/issues/977


Namely, that HAProxy will NUL-pad (as suffix) abstract unix socket 
paths, causing interop issues with other programs.
I raised a new feature request to make this behaviour tunable 
(https://github.com/haproxy/haproxy/issues/2479).


Since I do quite want this, I also came up with a patch for it, attached 
to this email.


Marked as RFC for now because there hasn't really been a broad 
discussion on the topic yet.
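
For a quick picture, usage with the attached patch would look like this
(naming and semantics subject to change given the RFC status):

    global
        # bind/connect abstract sockets with strlen(path) instead of
        # padding sun_path with NUL bytes (mimics socat's unix-tightsocklen)
        unix-bind tightsocklen on

    frontend fe
        bind abns@haproxy-f1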


Thanks,
Tristan
From 24962900bd6e537cdc2fd293359d2092c93ff683 Mon Sep 17 00:00:00 2001
From: Tristan 
Date: Wed, 6 Mar 2024 06:00:14 +
Subject: [PATCH] MINOR: socket: Allow disabling abstract unix socket paths
 NUL-padding

When an abstract unix socket is bound by HAProxy, NUL bytes are
appended at the end of its path until sun_path is filled (for a
total of 108 characters).

Here we add an option to pass only the non-NUL
length of that path to connect/bind calls, such that
the effective path of the socket's name is as humanly written.

This is achieved with the unix-bind tightsocklen { on | off }
global directive option, and mimics socat's unix-tightsocklen
in its semantics.

Fixes GH issues #977 and #2479

This probably doesn't need backporting.
---
 doc/configuration.txt  | 34 ---
 include/haproxy/global-t.h |  1 +
 include/haproxy/tools.h|  9 ++
 reg-tests/server/abns_tightsocklen.vtc | 38 ++
 src/cfgparse-global.c  | 23 +++-
 src/sock_unix.c|  2 +-
 src/tools.c|  2 +-
 7 files changed, 96 insertions(+), 13 deletions(-)
 create mode 100644 reg-tests/server/abns_tightsocklen.vtc

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 787103fe0..5d755741f 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -2668,18 +2668,30 @@ ulimit-n 
   See also: fd-hard-limit, maxconn
 
 unix-bind [ prefix  ] [ mode  ] [ user  ] [ uid  ]
-  [ group  ] [ gid  ]
+  [ group  ] [ gid  ] [ tightsocklen { on | off } ]
 
   Fixes common settings to UNIX listening sockets declared in "bind" statements.
   This is mainly used to simplify declaration of those UNIX sockets and reduce
   the risk of errors, since those settings are most commonly required but are
-  also process-specific. The  setting can be used to force all socket
-  path to be relative to that directory. This might be needed to access another
-  component's chroot. Note that those paths are resolved before HAProxy chroots
-  itself, so they are absolute. The , , ,  and 
-  all have the same meaning as their homonyms used by the "bind" statement. If
-  both are specified, the "bind" statement has priority, meaning that the
-  "unix-bind" settings may be seen as process-wide default settings.
+  also process-specific.
+  
+  The  setting can be used to force all socket path to be relative to
+  that directory. This might be needed to access another component's chroot. Note
+  that those paths are resolved before HAProxy chroots itself, so they are absolute.
+  
+  The , , ,  and  all have the same meaning as their
+  homonyms used by the "bind" statement. If both are specified, the "bind" statement
+  has priority, meaning that the "unix-bind" settings may be seen as process-wide
+  default settings.
+
+  The tightsocklen argument controls the behaviour of the socket path when binding
+  abstract unix domain sockets. The address of a socket (sun_path) is by convention
+  108-characters long on UNIX, but it is not mandatory, and shorter paths usually
+  bind successfully. When set to 'off' (the default), trailing NUL bytes are added
+  to the connect()/bind() syscalls to satisfy this convention. When set to 'on',
+  nothing is appended to the path even if the result is shorter than 108 characters,
+  and HAProxy attempts to bind it as-is. This setting effectively behaves like
+  socat's unix-tightsocklen option.
 
 unsetenv [ ...]
   Removes environment variables specified in arguments. This can be useful to
@@ -5502,7 +5514,8 @@ bind / [, ...] [param*]
 sun_path length is used for the address length. Some other programs
 such as socat use the string length only by default. Pass the option
 ",unix-tightsocklen=0" to any abstract socket definition in socat to
-make it compatible with HAProxy's.
+make it compatible with HAProxy's, or enable the same behaviour in
+HAProxy via the unix-bind directive in the global section.
 
   See also : "source", "option forwardfor", "unix-bind" and the PROXY protocol
  documentation, and section 5 about bind options.
@@ -10805,7 +10818,8 @@ server  [:[port]] [param*]
 sun_path length is used for the address length. Some other progr

Re: WolfSSL builds for use with HAProxy

2024-02-09 Thread Tristan

Hi Ilya,

On 09/02/2024 20:31, Илья Шипицин wrote:
I run QUIC Interop from time to time, WolfSSL shows the best 
compatibility compared to LibreSSL and aws-lc.

it really looks like a winner today


And in comparison to current QuicTLS?


I'm afraid in practice it works in a different way.
First, you install WolfSSL to prod, next the INSTALL/wiki gets updated :)


I appreciate the joke... :) but more seriously I am not very 
knowledgeable when it comes to low-level programming or the associated 
tuning/performance-testing.
So even if I deployed it, my opinion on that topic is unlikely to be the 
best (besides bug reports anyway).


Either way, for now I'm waiting on OCSP support first (hi William, 
Rémi); hopefully someone else figures out the best build flags by the 
time that's dealt with.


Tristan



Re: Managing haproxy configs at scale

2024-02-08 Thread Tristan

Hi Zach,

(Replying on the ML since your reply was directly to me only)

On 08/02/2024 21:10, Zach Pearson wrote:

Thanks for the reply, Tristan, and sorry for the late reply!
To add a little more context for our haproxy setup, we currently deploy haproxy 
along with a sidecar control-plane that programs dynamic data such as backends. 
We also have a sidecar application that we use to offload more complex logic 
via SPOE. Inside of haproxy, we also use a few lua converters and fetchers as 
well.


Seems like you already have as good a setup as it gets for that side of 
things then! I'm jealous :-)



I think breaking up our features into their own jinja files will help a lot! As 
we expand the environments we are deployed in this will help us better control 
the features in each environment.


Nice if that helps you out at least a bit!


Beyond managing our configs, one thing we are trying to handle nicely is 
multiple flows such as guest traffic vs authenticated traffic; for guest 
traffic we do stricter anti-abuse/bot checking, for example.
This leads to a case where we have large blocks of configs that are all 
applying some acls, which we’ve found can be difficult to manage and easy to 
make mistakes.
It almost feels like we need some sort of control flow or logical blocks that 
exist between the frontend and backend.
We are considering doing this via multiple frontends and having our 
public frontend choose which flow the traffic should go through and 
loop back to the appropriate frontend. Of course, this will incur some 
latency overhead, though.

If you have any advice or experience in this area that would be greatly 
appreciated!

I can certainly relate to this issue, and for very similar reasons.

Alas, I have not found any better solution so far. And the looping 
approach always felt wrong to me, so I have kept on abusing txvars and 
so on to avoid it... which is not very fun indeed.
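
For concreteness, I believe the loopback variant you describe would look
something like this (a sketch with illustrative names; the
send-proxy-v2/accept-proxy pair is just one way to carry client
information across the hop):

    frontend fe_public
        bind :443 ssl crt /etc/haproxy/site.pem
        use_backend be_loop_guest unless { req.cook(session) -m found }
        default_backend be_app

    backend be_loop_guest
        server loop abns@fe_guest send-proxy-v2

    frontend fe_guest
        bind abns@fe_guest accept-proxy
        # stricter anti-abuse/bot checks for guest traffic here
        default_backend be_app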


For what it's worth, we discussed tangential topics, in October, mainly 
in the 2.9-dev7 announce thread if you missed it 
(https://www.mail-archive.com/haproxy@formilux.org/msg44121.html)


It ended up as a mix of both sides of large-config issues:
- dpapi, socket api, XDS, ... for dynamic management of runtime config
- extensibility issues for complex configs, notably with regards to 
SPOE, filters, alternatives to LUA, ... for "this"


It's a bit of a messy thread and there's a lot to read, but I think the 
portion of the thread that starts here is most relevant: 
https://www.mail-archive.com/haproxy@formilux.org/msg44122.html


Aleksandar made a follow-up thread about filters here 
https://www.mail-archive.com/haproxy@formilux.org/msg44164.html, as they 
are one of the best extension points currently, but it was just him and 
me in the end.


That said, I'm sure the three of us are not the only ones trying to -- as 
Willy summarizes it well -- "put too much of the application into the load 
balancer".


So if we want to see improvements for that use-case (essentially HAProxy 
as a programmable edge, rather than "just" a load balancer / reverse 
proxy) it'd be quite helpful to hear details/opinions/suggestions from 
you and others on the matter. (and thus I end up asking rather than 
answering, sorry :)




Thanks,
Tristan



WolfSSL builds for use with HAProxy

2024-02-08 Thread Tristan

Hi all,

With the ever-increasing threat of one day needing to give up on OpenSSL 
1.1.1 (whenever the next bad CVE is found in QuicTLS 1.1.1w, 
essentially), I was looking a bit closer at alternatives.


Based on the wiki, 
https://github.com/openssl/openssl/issues/20286#issuecomment-1527869072, 
and that it has support for other features I'm interested in (notably 
ECH), WolfSSL seems by far my best bet at the moment.


However, given that almost everything is compile-time and the defaults focus 
on suitability for constrained embedded environments rather than 
big-proxy-server performance, does anyone have pointers on which flags are 
important, traps to avoid, etc.?


Besides the getrandom thing, HAProxy's INSTALL/wiki only vaguely mention 
that such build-time tuning is required, so I'm hoping someone might 
have gone through that already.


This one is a bit extra, but considering that aiming for bleeding edge 
with WolfSSL is not entirely compatible with how most distros work (ie 
even if it was packaged, it's likely going to be perpetually quite far 
behind), what does the future look like in that regard from the distros' 
side?


Thanks,
Tristan



Re: ACL and operator

2024-02-03 Thread Tristan



> On 3 Feb 2024, at 15:18, Willy Tarreau  wrote:
> 
> Quite honestly, we've though about it several times but you can't enforce
> such a change on 20 years of configs everywhere.

That is why I directly mentioned it being some form of opt-in behavior, because 
indeed we can’t expect everyone to update their config (or even agree with the 
change at all).

> Also that
> complicates concatenation and mixing of multiple modes (single/double quotes
> for example).

I think forcing only one would be okay if that were necessary for it, but I 
don’t quite see why?
As I imagine it, it’d mostly be adding an indirection step when consuming 
individual strings/regexes, checking the strictness flag, and rejecting at parse 
time if not well-quoted.
It would still operate one string at a time, so with a consistent quote style 
per invocation.

> And after all, in shell, strings do not need to be quoted and
> when this becomes a difficulty, users just learn to put quotes to visually
> isolate their words.

Frankly, shell (and even bash) are not very good in that regard. Even quite bad 
I think.

They are incredibly useful, and pipe chaining is one of the brightest ideas in 
computing I think, but its string handling is… not good.

One super common example is: who ever remembers things like when to use $*, $@, 
or "$@" or not, and for what kind of argument splitting?

I certainly still have to look it up almost every time after years and years, 
and I don’t believe I will remember them better in another 10 years either.

> That's exactly the same in shell, you never know how many backslashes you
> have to insert inside quotes to put other quotes.

Indeed. And I’d much prefer to be unable to forget quotes without an error. If 
I could do that with a flag, I would enable it in shell scripts and yaml files 
for sure. Just like I never write any shell script without set -euo pipefail 
anymore.

It’s for the same reasons as using zero-warning in haproxy, -Wall in all 
compilers and languages that support it, and so on:
1. I am a dumb human that makes dumb mistakes all the time
2. I have spent way more hours staring at my screen in confusion until I 
noticed a typo was at fault than on any other kind of bug, by far

> It does already exist, with -dW. In fact the diag mode enables emission
> of more warnings than the default one, and purposely takes the risk of
> emitting false positives which the user has to sort out.

Nice then! I wasn’t sure if they were treated the same as non-diag warnings in 
that regard, but that’s perfect if they are.

> I checked here, and among 435 test configs that emit zero warnings, only
> 15 are caught by -dD, essentially due to "weight 0" on a server line, or
> due to a "global" section placed after another section (typically after
> a "ring" one for traces).

Funny you mention it, I’m pretty sure I’m guilty of the ring/global ordering 
one. 
I will check that later.

Tristan



Re: ACL and operator

2024-02-03 Thread Tristan
Hi Willy,

> On 3 Feb 2024, at 12:48, Willy Tarreau  wrote:

> in fact we could check for
>> the presence of "and" or "or" on a line, or some other suspicious
>> constructs

Another approach might be to optionally enforce quotes around strings.

While it’d be a breaking change and sounds a bit difficult to offer as a flag, 
it avoids a lot of the current traps I think.
Though I can imagine that many wouldn’t like that (hence my pointing to a 
flag or something).

Essentially, the current state of things reminds me a bit of some of the 
error-prone-ness of YAML in that regard (implicit strings).
Similarly I can never remember the cases where it’s string, or "string", or 
str(string), or if quoting a string will maybe include the quotes in the actual 
value, and which ones are equivalent.

> Well, this one took just about the time needed for coffee to start to
> make its effect :-)
> 
> How about this:
[…]
> 
> It gives me no warning but triggers many diag warnings:
> 
>  $ ./haproxy -c -f diag.cfg
>  $ ./haproxy -dD -c -f diag.cfg
>  [NOTICE]   (23315) : haproxy version is 3.0-dev2-d480b7-58
>  [NOTICE]   (23315) : path to executable is ./haproxy
>  [DIAG] (23315) : config : parsing [diag.cfg:15] : pattern '//' looks like a failed attempt at commenting an end of line
>  [DIAG] (23315) : config : parsing [diag.cfg:16] : pattern 'and' looks like a failed attempt at using an operator inside a pattern list
>  […]
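
For anyone following along, my guess at the kind of lines that trigger
those two diagnostics (a reconstruction, not the actual diag.cfg):

    acl is_root path /    // only the root path
    acl is_foo path_beg /foo and /bar

In the first line '//' is read as one more pattern rather than a comment,
and in the second 'and' becomes a pattern instead of an operator.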

That already sounds very helpful. I haven’t checked locally, but adding an 
option similar to zero-warning for diagnostic warnings could be nice on top of 
it; depends on how many FPs happen in practice.

Tristan



Re: Managing haproxy configs at scale

2024-01-23 Thread Tristan

On 23/01/2024 22:44, Zach Pearson wrote:

Hello,

We are using haproxy in a few different environments/functions and the 
configurations are getting complicated to manage. We are already using 
Jinja2 to generate environment specific configs but was wondering if 
anyone has suggestions on tools or processes we can adopt to help us 
manage our configs.


Not sure how much you need to handle, but here's how we do it.
Our config is still in the "hand-written is ok" range, but with enough 
to be really awful via a bunch of "use_backend if" (8 frontends, 70 
backends):
- put your path/domain+path => backend matches in 2 dedicated files to 
be used as a HAProxy map

- use map_ matches to reference it and do most of the routing
- then keep frontends entirely about sanitizing inputs
- and service-specific config purely inside the relevant backends, which 
will be their own file with jinja includes


Ends up looking something like that (for us):

# pathonly.map
## things you want to match no matter the domain
/favicon generic_favicon
...

# hostpath.map
## things you match with domain+path
domain1#/   backend1
domain1#/someroute/ backend2
...
domain2#/   backend3
...etc

# haproxy.cfg
global
...

{{ include "frontend1.j2" }}
{{ include "frontend2.j2" }}
...

{{ include "backend1.j2" }}
{{ include "backend2.j2" }}

# frontend_common.j2
acl backend_match var(txn.selected_backend) -m found

... initial sanitizing etc ...

# (if sanitizing step didn't select a backend already, eg IP ban handling)

http-request set-var(txn.route_path) path,lower if !backend_match
http-request set-var(txn.route_dompath) hdr(host),concat(\#,txn.route_path) if !backend_match

# try path-only match first
http-request set-var(txn.selected_backend) var(txn.route_path),map_beg(/path/to/pathonly.map) if !backend_match

# otherwise try domain+path
http-request set-var(txn.selected_backend) var(txn.route_dompath),map_beg(/path/to/hostpath.map) if !backend_match


use_backend %[var(txn.selected_backend,generic_404)]

# frontend1.j2
frontend frontend1
  bind ...
  {{ include "frontend_common.j2" }}

One good result is that I was quite surprised by how many 
frontends+backends we actually have. One bad result is that 
includes+macros make finding the source of a bad config a real pain 
sometimes.


Also it's not super friendly to people discovering HAProxy, nor is it 
the most readable, but you already lost the readability fight after the 
25th line of "use_backend foo-bar if { hdr(Host) -i foo.domain } { 
path_beg -i /bar }" anyway.


However, if you're already past the humanly-manageable step (or have too 
many humans needing to cooperate on proxy configs), I see no good 
solution other than the service registry model, similar to how ingress 
controllers are implemented using k8s APIs:
- a program calls an API of yours (ICs call k8s, you have to make/find 
your own)

- which returns sets of frontend, routes, certs
- and their matching backends
- your program then updates HAProxy via the dataplane/socket API to match

Assuming upstream teams can then self-register their service instances 
on that API, it's mostly all dealt with for you.


As far as I've seen in the open:
- on the registry API side, Consul was essentially designed for this
- on the API<>HAProxy sync side, the haproxytech ingress controller 
implementation could probably give you ideas (or be partially repurposed 
for non-k8s use, which I said I'd open a GH issue for ages ago, and then 
forgot to do...)


As an aside, I imagine that this is roughly how cloud providers 
implement their LBaaS/APIGateway solutions, as it's similar in spirit to 
the cloud-init+metadata-server setup they nearly all use for VM management.


Tristan



Re: [*EXT*] Re: Public-facing haproxy : recommandation about headers

2023-12-09 Thread Tristan

> On 8 Dec 2023, at 22:38, Ionel GARDAIS  
> wrote:
> 
> What about using %[hdr(host,1)] to forcefully use the first Host header if 
> multiple headers are sent ?

I just deny requests with multiple Host headers with an error message telling 
clients to fix their stuff, personally.

But yes, that works too
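
For reference, the rule I use is along these lines (a sketch from memory):

    http-request deny deny_status 400 if { req.hdr_cnt(host) gt 1 }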

Tristan

Re: Public-facing haproxy : recommandation about headers

2023-12-08 Thread Tristan
Hi Ionel,

> On 8 Dec 2023, at 17:28, Ionel GARDAIS  
> wrote:
> 
> 
> Hi all,
> 
> Regarding a public facing haproxy, what would you recommend about headers 
> like X-Forwarded-* and Forwarded ?
> Delete them with http-request del-header ?
> Replace them with http-request set-header ?

Always delete, or replace. You have to be careful to *never* only append.

Otherwise the client sends X-Forwarded-For: foo, and your backend sees 
X-Forwarded-For: foo, 1.2.3.4

In that case your backend would likely (wrongfully) decide the client ip is « 
foo » (or whatever IP the client puts).

So typically I’d say to add to every single http frontend:

http-request set-header X-Forwarded-For %[src]
http-request set-header X-Forwarded-Host %[hdr(Host)]
http-request set-header X-Forwarded-Proto %[ssl_fc,iif(https,http)]
http-request set-header Forwarded "by=${HOSTNAME};for=%[src];host=%[hdr(Host)];proto=%[ssl_fc,iif(https,http)]"

Another one to be careful with is the difference between hdr(…) and fhdr(…). 
The first one splits on ",", which can give surprising results sometimes.
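
A quick sketch of the difference, assuming a request carrying
"X-Forwarded-For: 1.1.1.1, 2.2.2.2":

    http-request set-var(txn.xff_first) req.hdr(x-forwarded-for,1)  # "1.1.1.1"
    http-request set-var(txn.xff_raw) req.fhdr(x-forwarded-for)     # "1.1.1.1, 2.2.2.2"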

Finally, you might want to deny requests with a Via header. This typically 
indicates usage of a proxy. Maybe your clients legitimately use those, but more 
often than not this is just abuse of compromised routers for DoS attacks (at 
least in our case, it is).
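
In config terms, something like this (a sketch; adapt the status code and
condition to your own traffic):

    http-request deny deny_status 403 if { req.hdr(via) -m found }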

> 
> Do the options forwardfor and forwarded behave like a set-header or a 
> add-header directive ? 

If I remember correctly, the baked-in directives work by *appending*, so 
they’re only ok for intermediate proxies, not for edge ones.

Tristan

Re: [ANNOUNCE] haproxy-2.9-dev10

2023-11-20 Thread Tristan
Hi Aleksandar,

> On 20 Nov 2023, at 17:18, Aleksandar Lazic  wrote:
> 
At a configuration change, the reload leaves the old processes alive until 
the "hard-stop-after" value, and after that the connection is terminated, which 
does not look like the connection was taken over by the new process. The 
use case was log shipping with HAProxy in mode tcp, as far as I have 
understood the author.

Is that new behavior? Because I was under the impression that this is by design

If the new process took over an existing L4 connection, it seems like it’d 
cause strange behavior in quite a few cases due to configuration changes.

Either haproxy tries to reuse all old values and essentially needs to fork the 
new process for takeover (which then is equivalent to the current old process 
living for a while), or it applies new values to the existing connection 
(assuming it’s even possible in all cases) which is likely to just break it 
(removed frontend, backend, or server; or timeout changes, etc etc).

Seems like it’s just a design choice to me [1] and that HAProxy’s approach is 
sort of the only sane one…
Ofc that means potentially a lot of old processes, hence hard-stop-after and 
max-reloads as tunables.

Now in a k8s environment I can imagine high churn in pods causing a lot of 
server changes and making this a problem, but the official ingress controllers 
seem to generally mitigate it by using the runtime api when they can instead of 
hard reloads, and only using the latter in limited cases 🤔
Maybe they used the « community » ingress controller (bless their maintainer, 
it’s not a jab at it), which does rely more on hard reloads.

Either way, sounds unlikely to be a fix for it?

Tristan 

[1]: Also a bit off topic, but I always found ~infinite duration TCP 
connections to be a very strange idea… So many things can go wrong (and will go 
wrong) if you depend on them… at least it’s never going to be as reliable as 
client-side retries or moving to UDP where possible…



Re: 2.9-dev8: ha_panic from libcrypto/libssl (Ubuntu 22.04, OpenSSL 3.0.2)

2023-11-02 Thread Tristan

Hi,

On 02/11/2023 19:31, Valters Jansons wrote:

On Tue, Oct 24, 2023 at 10:35 AM Willy Tarreau  wrote:

We are running 2.9-dev8 for the server connection close fix for
"not-so-great" gRPC clients. We just experienced an ha_panic seemingly
triggered from OpenSSL 3. This is a fairly default Ubuntu 22.04
system, with locally built HAProxy package (as there are no "official"
dev builds).




It's worth checking if your build has the commit that fixes the SSL 
crash present on 2.9-dev8: 
https://github.com/haproxy/haproxy/issues/2329 (the fix is on master 
already, but it might not be in your build?)


Regards,
Tristan



Re: [PATCH] MINOR: lua: Add a flag to disable logging to stderr

2023-10-24 Thread Tristan



On 23/10/2023 14:38, Tristan wrote:

Thanks both for your time helping me with it :)

On 23/10/2023 14:34, Aurelien DARRAGON wrote:

Just a side note regarding the comment from the 2nd patch: it's not
useful to mention the commit ID from the previous patch given that the
effective commit ID will change when the patch will be applied on top of
haproxy master branch.


Oh, I thought patch commit hashes carried through when applied. Well, TIL.


On that note, I assume the easiest is to let Willy alter the commit 
message to update (or remove) the commit reference when merging?


Asking just in case you were waiting for me to send an amended patch for it.

Tristan




Re: [PATCH] MINOR: lua: Add a flag to disable logging to stderr

2023-10-23 Thread Tristan

Thanks both for your time helping me with it :)

On 23/10/2023 14:34, Aurelien DARRAGON wrote:

Just a side note regarding the comment from the 2nd patch: it's not
useful to mention the commit ID from the previous patch given that the
effective commit ID will change when the patch will be applied on top of
haproxy master branch.


Oh, I thought patch commit hashes carried through when applied. Well, TIL.

Tristan





Re: [PATCH] MINOR: lua: Add a flag to disable logging to stderr

2023-10-23 Thread Tristan

Hi Willy,

On 23/10/2023 10:16, Willy Tarreau wrote:

No more comments, overall this looks good to me. Thus in summary, let's
try to avoid the ambiguity of "tune.lua.log" alone, and re-adjust the
bitfields. By the way, if we're having the same prefix "tune.lua.log"
for both options, it's an indication that we should likely have a dot
after to enable loggers (or any other name) and stderr:

 tune.lua.log.loggers on|off
 tune.lua.log.stderr on|auto|off



One being a prefix of the other was also not great. So even if 
"log.loggers" is a little awkward, this is probably for the best after all.


So here's the 2 patches again. Hopefully they match what you wanted.

Two notes:
- I saw various forms of enum declarations in the codebase (naming it or 
not and typedef-ing or not) and I don't practice C enough to have an 
opinion on the matter, so I just picked the one I think looks nicest...
- For the regtests, if we don't want them to tee stderr to check it, 
then there's little point, and the chance of breaking this all is also 
rather low anyway as you said, so none were added


Regards,
Tristan
From 7ef333b8803c213ab1bd0a33a73faa30336ab55e Mon Sep 17 00:00:00 2001
From: Tristan 
Date: Mon, 23 Oct 2023 13:07:39 +0100
Subject: [PATCH] MINOR: lua: Add flags to configure logging behaviour

Until now, messages printed from LUA log functions were sent both to
the any logger configured for the current proxy, and additionally to
stderr (in most cases)

This introduces two flags to configure LUA log handling:
- tune.lua.log.loggers to use standard loggers or not
- tune.lua.log.stderr to use stderr, or not, or only conditionally

This addresses github feature request #2316

This can be backported to 2.8 as it doesn't change previous behaviour.
---
 doc/configuration.txt | 22 
 doc/lua-api/index.rst | 12 ---
 doc/lua.txt   |  4 +++
 src/hlua.c| 80 +--
 4 files changed, 111 insertions(+), 7 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index bb599bca9..433f15baf 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -1196,6 +1196,8 @@ The following keywords are supported in the "global" 
section :
- tune.lua.service-timeout
- tune.lua.session-timeout
- tune.lua.task-timeout
+   - tune.lua.log.loggers
+   - tune.lua.log.stderr
- tune.max-checks-per-thread
- tune.maxaccept
- tune.maxpollevents
@@ -3192,6 +3194,26 @@ tune.lua.task-timeout 
   remain alive during of the lifetime of HAProxy. For example, a task used to
   check servers.
 
+tune.lua.log.loggers { on | off }
+  Enables ('on') or disables ('off') logging the output of LUA scripts via the
+  loggers applicable to the current proxy, if any.
+  
+  Defaults to 'on'.
+
+tune.lua.log.stderr { on | auto | off }
+  Enables ('on') or disables ('off') logging the output of LUA scripts via
+  stderr.
+  When set to 'auto', logging via stderr is conditionally 'on' if any of:
+  
+- tune.lua.log.loggers is set to 'off'
+- the script is executed in a non-proxy context with no global logger
+- the script is executed in a proxy context with no logger attached
+  
+  Please note that, when enabled, this logging is in addition to the logging
+  configured via tune.lua.log.loggers.
+
+  Defaults to 'on'.
+
 tune.max-checks-per-thread 
   Sets the number of active checks per thread above which a thread will
   actively try to search a less loaded thread to run the health check, or
diff --git a/doc/lua-api/index.rst b/doc/lua-api/index.rst
index 8a29bc5b5..90802cfd4 100644
--- a/doc/lua-api/index.rst
+++ b/doc/lua-api/index.rst
@@ -267,8 +267,10 @@ Core class
   **context**: body, init, task, action, sample-fetch, converter
 
   This function sends a log. The log is sent, according with the HAProxy
-  configuration file, on the default syslog server if it is configured and on
-  the stderr if it is allowed.
+  configuration file, to the loggers relevant to the current context and
+  to stderr if it is allowed. 
+  
+  The exact behaviour depends on tune.lua.log.loggers and tune.lua.log.stderr.
 
   :param integer loglevel: Is the log level associated with the message. It is a
      number between 0 and 7.
@@ -2648,8 +2650,10 @@ TXN class
 .. js:function:: TXN.log(TXN, loglevel, msg)
 
   This function sends a log. The log is sent, according with the HAProxy
-  configuration file, on the default syslog server if it is configured and on
-  the stderr if it is allowed.
+  configuration file, to the loggers relevant to the current context and
+  to stderr if it is allowed. 
+  
+  The exact behaviour depends on tune.lua.log.loggers and tune.lua.log.stderr.
 
   :param class_txn txn: The class txn object containing the data.
   :param integer loglevel: Is the log level associated with the message. It is
diff --git a/doc/lua.txt b/doc/lua.txt
index 8d5561668..25db8a304 100644
--- a/doc/lua.txt
+++ b/doc/

Re: Some filter discussion for the future

2023-10-21 Thread Tristan



On 19/10/2023 21:53, Aleksandar Lazic wrote:


As I'm quite a newbie at WASM I will mainly create an "echo all params" 
file in shell/perl/go/js or any other language and convert it to WASM :-).


Exactly the same here! And that's why it is good. You can "just" pick 
your favorite one [provided a transpiler exists] and roll with it :-)


Though your point about whether it's a fad or not is fair. It's always 
hard to say, and broad WASM adoption has been "coming" for a long time 
already.


Well, that's another option: to add a C-based WASM runtime into the addons 
directory, similar to promex.


Sounds quite reasonable to start with an addon indeed.

Agree that an OAuth2 filter would be very nice. I have also thought about 
that but it's not an easy task, at least for me, because the OAuth2 flow 
is quite complex.


Implementing the full OAuth2 flow is non-trivial, but also not THAT bad 
from a client's perspective. It's just (rightfully) out of scope for 
HAProxy. And its sticking to providing "only" JWT parsing+querying makes 
a lot of sense.


But nonetheless, at the moment the best option is something like the 
very popular https://github.com/oauth2-proxy/oauth2-proxy on the side, 
which doesn't integrate particularly nicely with HAProxy, so being able 
to bake things in a 3rd-party extension at runtime would be much nicer.


and there are very nice tools out there like Keycloak, 
which handles this task very well.


This is a tangent, but I feel compelled to expand on Keycloak a bit.

First, it deserves praise, for being a very mature (and very 
full-featured) OpenID+SAML+OAuth2+... server.
And pretty much the only (mature one) that is also open source and 
maintained.
And even amongst newer ones, the list of similar projects that are 
"serious" [1] is very short (Ory's suite is essentially the only one I 
know of at the moment).


Which is why we decided on it and deployed it ourselves a year ago, and 
have used it ever since.


But there are significant limitations we wish we knew at the time:
- Multi-node session replication (think HAProxy stick-table peering) has 
reliability issues particularly around rebalancing, causing regular 
session loss if you depend on it too much (for example with autoscaling) 
https://github.com/keycloak/keycloak/issues/14040
- There is no such thing as transparent rolling updates. Any update, 
even minor/patch ones, means disconnecting everyone. (in-memory-only 
user sessions + versioning of these + no version migration not even for 
n->n+1)
- No public API, so theming means one of these absolutely 
soul-crushing template systems, or a CSS hackjob full of !important rules
- And a lot of "surprising" quirks you just have to know about, because 
it was clearly meant more for internal corporate use originally


That said, it does work well otherwise, with decent performance even at 
fairly large scale (400k+ active sessions in a single realm). So it is 
still a decent solution. I'm just not sure I'd pick it again today if I 
had to choose.


Tristan

[1]: By serious, I mean intended for usage when malicious clients are 
not just a theoretical threat. So with a very defensive design, 
comprehensive error handling that always fails closed, etc. There is an 
endless list of "friendlier" options if those are not things one 
desires, and that doesn't make them bad. Tis just a tradeoff.




Re: [PATCH] MINOR: lua: Add a flag to disable logging to stderr

2023-10-20 Thread Tristan
Following up on Aurélien's remarks (thanks for catching my forgetting 
it!), here's an additional patch to update the LUA-specific documentation.


Could be kept standalone or merged into the first patch, but to avoid 
re-submitting the patchset already, here it is standalone for now.


Tristan
From 67c3177c51044d288036044f7d17f6f037ac4f55 Mon Sep 17 00:00:00 2001
From: Tristan 
Date: Fri, 20 Oct 2023 18:59:03 +0100
Subject: [PATCH] DOC: lua: update core.log and TXN.log documentation

Mention that their behaviour depends on the tunables
introduced in commit 45e6f27140 ("MINOR: lua: Add
flags to configure logging behaviour")

Rather than being entirely precise here and risking drift, simply
refer to the relevant tunables for details.

This should be backported wherever the aforementioned commit is
---
 doc/lua-api/index.rst | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/doc/lua-api/index.rst b/doc/lua-api/index.rst
index 8a29bc5b5..9b9dcbd2c 100644
--- a/doc/lua-api/index.rst
+++ b/doc/lua-api/index.rst
@@ -267,8 +267,10 @@ Core class
   **context**: body, init, task, action, sample-fetch, converter
 
   This function sends a log. The log is sent, according with the HAProxy
-  configuration file, on the default syslog server if it is configured and on
-  the stderr if it is allowed.
+  configuration file, to the loggers relevant to the current context and/or
+  stderr. 
+  
+  The exact behaviour depends on tune.lua.log and tune.lua.log-stderr tunables.
 
   :param integer loglevel: Is the log level associated with the message. It is a
      number between 0 and 7.
@@ -2648,8 +2650,10 @@ TXN class
 .. js:function:: TXN.log(TXN, loglevel, msg)
 
   This function sends a log. The log is sent, according with the HAProxy
-  configuration file, on the default syslog server if it is configured and on
-  the stderr if it is allowed.
+  configuration file, to the loggers relevant to the current context and/or
+  stderr. 
+  
+  The exact behaviour depends on tune.lua.log and tune.lua.log-stderr tunables.
 
   :param class_txn txn: The class txn object containing the data.
   :param integer loglevel: Is the log level associated with the message. It is
-- 
2.41.0



Re: [PATCH] MINOR: lua: Add a flag to disable logging to stderr

2023-10-20 Thread Tristan

Hi all again,

Here is the updated patch set after changes based on feedback received.
The change is now split across 2 patches.

Patch 0001 adding:
- tune.lua.log { on | off } (defaults to 'on') for usage of loggers
- tune.lua.log-stderr { on | auto | off } (defaults to 'on') for usage of stderr


Effectively no change to current behaviour, hence fit for backport.

Patch 0002 changing:
- tune.lua.log-stderr default from 'on' to 'auto'

Changes behaviour, hence no backport.

I guess my only remaining question is whether we want regtests for this?

I'd be happy to write some if yes, but I know regtest runtime is a 
concern and it'd require a bit of test infra work to make capturing 
stderr output possible, so I just want to be sure it is desirable first :-)


Regards,
Tristan
From 45e6f271404d20479284fc3e0e1f7448e260e016 Mon Sep 17 00:00:00 2001
From: Tristan 
Date: Fri, 20 Oct 2023 16:31:58 +0100
Subject: [PATCH] MINOR: lua: Add flags to configure logging behaviour

Until now, messages printed from LUA log functions were sent both to
the any logger configured for the current proxy, and additionally to
stderr (in most cases)

This introduces two flags to configure LUA log handling:
- tune.lua.log to use standard loggers or not
- tune.lua.log-stderr to use stderr, or not, or only conditionally

This addresses github feature request #2316

This can be backported to 2.8 as it doesn't change previous behaviour.
---
 doc/configuration.txt | 22 
 doc/lua.txt   |  4 +++
 src/hlua.c| 80 +--
 3 files changed, 103 insertions(+), 3 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 50fe882d0..7c4a585ec 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -1196,6 +1196,8 @@ The following keywords are supported in the "global" 
section :
- tune.lua.service-timeout
- tune.lua.session-timeout
- tune.lua.task-timeout
+   - tune.lua.log
+   - tune.lua.log-stderr
- tune.max-checks-per-thread
- tune.maxaccept
- tune.maxpollevents
@@ -3192,6 +3194,26 @@ tune.lua.task-timeout 
   remain alive during of the lifetime of HAProxy. For example, a task used to
   check servers.
 
+tune.lua.log { on | off }
+  Enables ('on') or disables ('off') logging the output of LUA scripts via the
+  loggers applicable to the current proxy, if any.
+  
+  Defaults to 'on'.
+
+tune.lua.log-stderr { on | auto | off }
+  Enables ('on') or disables ('off') logging the output of LUA scripts via
+  stderr.
+  When set to 'auto', logging via stderr is conditionally 'on' if any of:
+  
+- tune.lua.log is set to 'off'
+- the script is executed in a non-proxy context with no global logger
+- the script is executed in a proxy context with no logger attached
+  
+  Please note that, when enabled, this logging is in addition to the logging
+  configured via tune.lua.log.
+
+  Defaults to 'on'.
+
 tune.max-checks-per-thread 
   Sets the number of active checks per thread above which a thread will
   actively try to search a less loaded thread to run the health check, or
diff --git a/doc/lua.txt b/doc/lua.txt
index 8d5561668..8d244ab3a 100644
--- a/doc/lua.txt
+++ b/doc/lua.txt
@@ -630,6 +630,10 @@ It displays a log during the HAProxy startup:
 
[alert] 285/083533 (14465) : Hello World !
 
+Note: By default, logs originating from a LUA script are sent to the loggers
+applicable to the current context, if any, and additionally to stderr. See
+tune.lua.log and tune.lua.log-stderr for more information.
+
 Default path and libraries
 --
 
diff --git a/src/hlua.c b/src/hlua.c
index e94325727..70a25e762 100644
--- a/src/hlua.c
+++ b/src/hlua.c
@@ -69,6 +69,20 @@
 #include 
 #include 
 
+/* Global LUA flags */
+/* log handling */
+#define HLUA_TUNE_LOG 0x0001 /* send to current logger */
+#define HLUA_TUNE_LOG_STDERR_ON   0x0010 /* send to stderr */
#define HLUA_TUNE_LOG_STDERR_AUTO 0x0020 /* send to stderr if no logger in use */
+#define HLUA_TUNE_LOG_STDERR_OFF  0x0040 /* never send to stderr */
+#define HLUA_TUNE_LOG_STDERR_MASK 0x0070
+
+/* default flag values */
+#define HLUA_TUNE_FLAGS_DFLT 0x0011
+
+/* flags made of HLUA_TUNE_* */
+static uint hlua_tune_flags = HLUA_TUNE_FLAGS_DFLT;
+
 /* Lua uses longjmp to perform yield or throwing errors. This
  * macro is used only for identifying the function that can
  * not return because a longjmp is executed.
@@ -1366,8 +1380,9 @@ const char *hlua_show_current_location(const char *pfx)
return NULL;
 }
 
-/* This function is used to send logs. It try to send on screen (stderr)
- * and on the default syslog server.
+/* This function is used to send logs. It tries to send them to:
+ * - the log target applicable in the current context, AND
+ * - stderr if not in quiet mode or explicitly disabled
  */
 static inline void hlua_sendlog(struct proxy *px, int level, const char *msg)
 {
@@ -1392,

Re: [PATCH] MINOR: lua: Add a flag to disable logging to stderr

2023-10-20 Thread Tristan



On 20/10/2023 15:30, Tristan wrote:

ie in this snippet (hlua.c:1387):

static inline void hlua_sendlog(struct proxy *px, ...)
{
 ...
 if (... && (!LIST_ISEMPTY(&px->loggers)))
     return;

has the following results:
- locally from source => compiles happily
- locally from clone + patch => compile error
- in CI from clone + patch => compile error

When it fails, this is what I get back:

src/hlua.c:1417:86: error: ‘struct proxy’ has no member named ‘loggers’
  1417 | if ((hlua_tune_flags & (HLUA_TUNE_LOG)) && (!LIST_ISEMPTY(&px->loggers)))
       |                                                                 ^~
include/haproxy/list.h:102:28: note: in definition of macro ‘LIST_ISEMPTY’
   102 | #define LIST_ISEMPTY(lh) ((lh)->n == (lh))
   |    ^~

Which would make sense if struct proxy had no such member, but it most 
certainly has one...


I must be missing something obvious, which hopefully someone can point 
out to me...


Well, Aurélien replied off-list and it turns out my build scripts were not 
using the same exact commit I was writing it against... 
https://github.com/mangadex-pub/haproxy/commit/46c15afd27876c3e260fc0c70234e0aff70422de#r130521133


Thanks Aurélien, and :facepalm: to me

Tristan




Re: [PATCH] MINOR: lua: Add a flag to disable logging to stderr

2023-10-20 Thread Tristan

Hi again Willy,

On 18/10/2023 07:47, Willy Tarreau wrote:

[...]

 maybe we can have a 3rd value "auto" which would be the default
 and which would only log to stderr if there's no other logger ? I
 don't know if we have this info where it's used, though. Hmmm at
 first glance we seem to have it by testing if either px->loggers
 is non-empty or global.loggers is non-empty. Thus it could be the
 nicest approach, having "on" by default in 2.8 and older and "auto"
 on 2.9 maybe ?


I did go that route and thus split it into 2 patches (one for the flags 
+ 'on' for that, and one that swaps from 'on' to 'auto').


Also from my reading of the config parser it seems to me like the global 
loggers are always appended to the proxy's loggers if 'log global' is used.
So I opted to only check the proxy's loggers since I can see no 
situations where one has 1) some global loggers and 2) a proxy that 
expressly doesn't inherit log global and 3) no other proxy-specific 
logger target.


Anyway, that small question aside, I'm seeing a pretty strange 
discrepancy when compiling in different places.


ie in this snippet (hlua.c:1387):

static inline void hlua_sendlog(struct proxy *px, ...)
{
...
if (... && (!LIST_ISEMPTY(&px->loggers)))
        return;

has the following results:
- locally from source => compiles happily
- locally from clone + patch => compile error
- in CI from clone + patch => compile error

When it fails, this is what I get back:

src/hlua.c:1417:86: error: ‘struct proxy’ has no member named ‘loggers’
 1417 | if ((hlua_tune_flags & (HLUA_TUNE_LOG)) && (!LIST_ISEMPTY(&px->loggers)))
      |                                                                 ^~

include/haproxy/list.h:102:28: note: in definition of macro ‘LIST_ISEMPTY’
  102 | #define LIST_ISEMPTY(lh) ((lh)->n == (lh))
  |^~

Which would make sense if struct proxy had no such member, but it most 
certainly has one...


I must be missing something obvious, which hopefully someone can point 
out to me...


Regards,
Tristan

nb: don't want to waste your time by forcing you to review the patches 
before I've had a chance to properly test them, but if seeing them helps 
more than the error, they are here atm: 
https://github.com/mangadex-pub/haproxy/tree/93c92592bc693b53a27305152de80f544664f49d/haproxy/patches-dev




Re: [PATCH] MINOR: lua: Add a flag to disable logging to stderr

2023-10-18 Thread Tristan

By the way, since we're talking about it, I think that actually it's not
logs that we're sending to stderr, it's Lua's alert messages that we're
sending both to stderr and logs. 


Well I am no expert in what qualifies as "logs" vs "alerts", but the 
messages sent from (for example) txn:Info() are included in this logic 
(which is why I cared in the first place).


If that is an alert, then it's fine, but I would not have expected it.


So maybe it would be more convenient with these two choices:

  tune.lua.alerts-log { on | off }
  tune.lua.alerts-stderr { on | off }

Or with this single one:

  tune.lua.alerts { off | stderr | log | both }

But please keep in mind the default that would change between 2.8 and 2.9
and that would need to be placed at the right location. The last one makes
it less easy in this regard.


I feel like the 2-different-flags approach is a little bit better here, 
so I'll go with that, since it also gives the option, if ever needed, to 
extend things relatively cleanly later. For example:

- tune.lua.log { off | on [log $logger] [log-format $logfmt] }
- tune.lua.log-stderr { off | on | [min-level $level] }

Not that I necessarily plan either, but the first one would be pretty 
neat, potentially.



Do what's easier for you. Really. Make sure the final version contains
a patch and a commit message that apply easily (either all inline or
attached, as you see fit). The only thing to avoid is to send multiple
series for review before getting feedback, that's extremely confusing,
but fortunately this almost never happens here.


OK. I'll probably reply to the first message of this thread with two 
diff files then (one for 2.8 and one for 2.9, to account for the default 
config difference).


Tristan




Re: [PATCH] MINOR: lua: Add a flag to disable logging to stderr

2023-10-18 Thread Tristan





...

One last thing, SEND_ERR() reports to stderr through ha_alert() and
hlua_sendlog() does it through fprintf(stderr, ) by appending a static
header containing the log level, the date and the pid: maybe we should
try to unify those outputs as well?


I'm not sure anyone really *wants* to have to configure all that stuff
for error reporting. Maybe the format is not great, maybe it should use
send_log() etc, quite frankly I don't know as I have not visited that
part. [...]


To be honest, here I'm somewhat on Aurélien's side: I would actually 
appreciate a lua-log-format sort of configurable option.


Mainly so I can prefix messages (for example) with the timestamp and 
request id, like I do for HTTP traffic.


Raw output looks a bit odd in logs, and enforcing prefixes inside 
scripts is a bit of a maintenance burden (and error prone + expensive).


That said, I'd definitely do that in a follow-up patch.

Tristan



Re: [PATCH] MINOR: lua: Add a flag to disable logging to stderr

2023-10-18 Thread Tristan

Hi Willy,


On 18/10/2023 07:47, Willy Tarreau wrote:

Hi Tristan,

...

I'm fine with the general approach, but I'm having two comments:

   - using the word "also" in the option name is really not welcome
 ("tune.lua.also-log-to-stderr"), and actually more confusing than
 without because one might wonder "why also, where is it also sent".


Funny you say that, because that's exactly what I intended by using 
"also": what is surprising to me about this behavior is less that it 
prints to stderr and more that it ADDITIONALLY prints to stderr, which 
is why I thought emphasizing it made sense.



 Most of our options are compatible and cumulable, and not exclusive,
 so it should be seen as a boolean like any other one. As such I
 would just call it "tune.lua.log-stderr" and be done with it. That
 may even allow you not to resize the keywords array, which could
 help with backports ;-)


Not resizing the array is ideal indeed, but I'd argue that "log-stderr" 
is a little bit misleading, since it's really not a switch between two 
output streams, but rather a switch for a second one (stderr) also being 
written to.


Though frankly I don't care THAT much so if everyone's fine with it, I 
also am.


For the record, I agree it is ugly wording though, but I looked for what 
similar flags existed and how other software defined them, and 
"alsologtostderr" seems to be the common approach these days (though imo 
such a flag shouldn't even exist in the first place).




   - should we really stick to "on" as the default behavior in 2.9 ? I


I'd be up for that, but I expected to be told off. If no one objects I'm 
more than happy to make it the default. Or rather call it 
"tune.lua.log-stderr" as you propose, and default that one to "off".



 sense that basically nobody wants that by default anymore, and if
 we want to change the default, it will only be in an odd version,
 hence 3.1 for the next one. Maybe now's the right moment ? Or if
 the concern is to lose the messages when no logs are configured,
 maybe we can have a 3rd value "auto" which would be the default
 and which would only log to stderr if there's no other logger ? I
 don't know if we have this info where it's used, though. Hmmm at
 first glance we seem to have it by testing if either px->loggers
 is non-empty or global.loggers is non-empty. Thus it could be the
 nicest approach, having "on" by default in 2.8 and older and "auto"
 on 2.9 maybe ?


I suppose that works too. I actually never ran HAProxy without any log 
config and thought the defaults would be equivalent to:


global
  log stderr format raw local0

defaults
  log global

(ie that at runtime logger fields were always set)



A few comments on the patch:


[...]

+static int hlua_tune_flags = HLUA_TUNE_ALSO_LOG_TO_STDERR;


When you're using such arrays of flags, please precede them with a short
comment saying "flags made of HLUA_TUNE_*" or something like this so that
if it ever grows and more stuff gets inserted in between, it remains easy
to figure (that's one issue that has affected all other ones over time).


Fair point. Feared it'd be a bit repetitive with the comment just above, 
but I see what you mean.


To be honest I wasn't even sure where in the file I should put it. Or if 
it was preferable to add it in global.tune.options.


But I saw a comment above that struct saying it needed to be redone 
"properly", so I figured I shouldn't add to it.




Also for bit fields, I'd prefer to use an unsigned int (we have "uint"
as a short form) so that when you see them in gdb you don't get negative
values that are even more annoying to copy-paste and decode.


Ah, sure.
I thought so too, but guessed that there was a reason for signed ints, 
since global.tune.options uses one.





+static int hlua_also_log_to_stderr(char **args, int section_type, struct proxy 
*curpx,
+   const struct proxy *defpx, const char 
*file, int line,
+   char **err)


It's not obvious at all that this function is there to parse a keyword,
I'm seeing it as the one which will actually log. Almost all of our
keyword parsing functions have "parse" in their name (others
historically have "check"). I'm seeing that it's not the case for the
majority of the historic Lua functions, but this should not be a reason
for not fixing it (and maxmem was fixed already). Better call it
"hlua_parse_log_stderr" or something like this that reminds the keyword.


Got it. Definitely agree; I only went that route for consistency at 
first.



Please have a look at these points (or comment if you disagree), and
I'll happily merge it!


I'll follow up with the requested changes; however, I'm not familiar 
with patches over mailing lists. Should the updated patch be sent in its 
own thread, or as a reply to this one with the diff file attached?


Tristan



[PATCH] MINOR: lua: Add a flag to disable logging to stderr

2023-10-17 Thread Tristan

By default, messages printed from LUA log functions are sent both to
the configured log target and to stderr (in most cases).
This introduces tune.lua.also-log-to-stderr for disabling that
second copy of the message being sent to stderr.

Addresses https://github.com/haproxy/haproxy/issues/2316

This could be backported if wanted, since it preserves the behaviour
that existed prior to it.
---
 doc/configuration.txt |  6 ++++++
 doc/lua.txt           |  4 ++++
 src/hlua.c            | 50 +++++++++++++++++++++++++++++++++++++++++++-----------
 3 files changed, 49 insertions(+), 11 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 88a576795..771a569c0 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -1195,6 +1195,7 @@ The following keywords are supported in the "global" section :
    - tune.lua.service-timeout
    - tune.lua.session-timeout
    - tune.lua.task-timeout
+   - tune.lua.also-log-to-stderr
    - tune.max-checks-per-thread
    - tune.maxaccept
    - tune.maxpollevents
@@ -3180,6 +3181,11 @@ tune.lua.task-timeout <timeout>
   remain alive during of the lifetime of HAProxy. For example, a task used to
   check servers.
 
+tune.lua.also-log-to-stderr { on | off }
+  Enables ('on') or disables ('off') logging the output of lua log functions
+  to stderr in addition to the configured log target. To preserve historical
+  behaviour, this defaults to 'on'.
+
 tune.max-checks-per-thread <number>
   Sets the number of active checks per thread above which a thread will
   actively try to search a less loaded thread to run the health check, or
diff --git a/doc/lua.txt b/doc/lua.txt
index 8d5561668..5e5712938 100644
--- a/doc/lua.txt
+++ b/doc/lua.txt
@@ -630,6 +630,10 @@ It displays a log during the HAProxy startup:
 
     [alert] 285/083533 (14465) : Hello World !
 
+Note: By default, logs created from a LUA script are printed to the log target
+in your configuration and additionally to stderr, unless the flag
+tune.lua.also-log-to-stderr is set to 'off'.
+
 Default path and libraries
 --------------------------
diff --git a/src/hlua.c b/src/hlua.c
index c686f222a..261aee763 100644
--- a/src/hlua.c
+++ b/src/hlua.c
@@ -69,6 +69,12 @@
 #include 
 #include 
 
+/* Global LUA on/off flags */
+/* if on, LUA-originating logs are duplicated to stderr */
+#define HLUA_TUNE_ALSO_LOG_TO_STDERR (1<<0)
+
+static int hlua_tune_flags = HLUA_TUNE_ALSO_LOG_TO_STDERR;
+
 /* Lua uses longjmp to perform yield or throwing errors. This
  * macro is used only for identifying the function that can
  * not return because a longjmp is executed.
@@ -1366,8 +1372,9 @@ const char *hlua_show_current_location(const char *pfx)
 	return NULL;
 }
 
-/* This function is used to send logs. It try to send on screen (stderr)
- * and on the default syslog server.
+/* This function is used to send logs. It tries to send them to:
+ * - the log target applicable in the current context, AND
+ * - stderr if not in quiet mode or explicitly disabled
  */
 static inline void hlua_sendlog(struct proxy *px, int level, const char *msg)
 {
@@ -1394,6 +1401,9 @@ static inline void hlua_sendlog(struct proxy *px, int level, const char *msg)
 
 	send_log(px, level, "%s\n", trash.area);
 	if (!(global.mode & MODE_QUIET) || (global.mode & (MODE_VERBOSE | MODE_STARTING))) {
+		if (!(hlua_tune_flags & (HLUA_TUNE_ALSO_LOG_TO_STDERR)))
+			return;
+
 		if (level == LOG_DEBUG && !(global.mode & MODE_DEBUG))
 			return;
 
@@ -12433,6 +12443,23 @@ static int hlua_parse_maxmem(char **args, int section_type, struct proxy *curpx,
 	return 0;
 }
 
+static int hlua_also_log_to_stderr(char **args, int section_type, struct proxy *curpx,
+                                   const struct proxy *defpx, const char *file, int line,
+                                   char **err)
+{
+	if (too_many_args(1, args, err, NULL))
+		return -1;
+
+	if (strcmp(args[1], "on") == 0)
+		hlua_tune_flags |= HLUA_TUNE_ALSO_LOG_TO_STDERR;
+	else if (strcmp(args[1], "off") == 0)
+		hlua_tune_flags &= ~HLUA_TUNE_ALSO_LOG_TO_STDERR;
+	else {
+		memprintf(err, "'%s' expects either 'on' or 'off' but got '%s'.", args[0], args[1]);
+		return -1;
+	}
+	return 0;
+}
 
 /* This function is called by the main configuration key "lua-load". It loads and
  * execute an lua file during the parsing of the HAProxy configuration file. It is
@@ -12673,15 +12700,16 @@ static int hlua_config_prepend_path(char **args, int section_type, struct proxy
 
 /* configuration keywords declaration */
 static struct cfg_kw_list cfg_kws = {{ },{
-	{ CFG_GLOBAL, "lua-prepend-path", hlua_config_prepend_path },
-	{ CFG_GLOBAL, "lua-load", hlua_load },
-	{ CFG_GLOBAL, "lua-load-per-thread",  hlua_load_per_thread },
-	{ CFG_GLOBAL, "tune.lua.session-timeout", hlua_session_timeout },
-

Re: Some filter discussion for the future

2023-10-17 Thread Tristan
l rather than randomly tweaking things around until it doesn't 503 
anymore because something's nil-able and you didn't handle it.


> [...] for example I remember about JWT which stared in Lua and is
> now native

Saw that yes, and I'm using the native JWT things atm myself. I'd note 
however that while it's great to have it, it's still a little bit 
awkward in practice because of the number of fields to check: proper 
validation of iss, aud, exp, and keys turns out to be very verbose in 
the haproxy configuration.
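
To illustrate what I mean by verbose, here's a rough sketch of the kind 
of ruleset involved (the audience value and key path are made up for 
the example, and auth.foo.bar is the same imaginary issuer as below):

http-request set-var(txn.bearer) http_auth_bearer
http-request set-var(txn.alg) var(txn.bearer),jwt_header_query('$.alg')
http-request deny unless { var(txn.alg) -m str RS256 }
http-request deny unless { var(txn.bearer),jwt_verify(txn.alg,"/etc/haproxy/jwt/pubkey.pem") -m int 1 }
http-request deny unless { var(txn.bearer),jwt_payload_query('$.iss') -m str "https://auth.foo.bar/" }
http-request deny unless { var(txn.bearer),jwt_payload_query('$.aud') -m str my-api }
http-request set-var(txn.now) date
http-request deny unless { var(txn.bearer),jwt_payload_query('$.exp','int'),sub(txn.now) -m int gt 0 }

Eight lines for a single issuer/audience pair, and that's before key 
rotation enters the picture.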


Not saying it's a bad thing in any way, mind you. And it's great to have 
it. But it's not quite the same as a theoretical
http-request filter oauth2 
https://auth.foo.bar/.well-known/openid-configuration


Tristan



Re: Options for mitigating CVE-2023-44487 with HAProxy

2023-10-10 Thread Tristan
Thanks for looking into it, Willy :-)

> On 10 Oct 2023, at 19:24, Willy Tarreau  wrote:
>  […]
> But for now if you site requires any of this, I can't see how it has not
> experienced weekly outages from standard attacks.

Funny that you mention this; a bit off-topic, but we had enjoyed a relatively 
peaceful 2023 for the most part, after some law enforcement operations last 
December/January. [1]

But things are picking up pace again since mid-summer, and this is indeed just 
one of what feels like a new attack method a week that we hear of or have the 
displeasure to receive…
Though most seem to be either L4 or non-HTTP L7.

> As Tristan mentioned, lowering tune.h2.be.max-concurrent-streams may
> also slow them down but it will also slow down some sites with many
> objects (think shops with many images). For a long time the high
> parallelism of H2 was sold as a huge differentiator, I don't feel like
> starting to advertise lowering it now.

For others that may be tempted to copy-paste it and then see errors, I think you 
meant tune.h2.fe.max-… (emphasis on « fe »)

> There are lots of other interesting attacks on the H2 protocol, that
> can be triggered just with a regular client with low timeouts, with low
> stream windows (use h2load -w 1 to have fun), zero-window during transfers,
> and even playing with one-byte continuation frames that may force some
> components to perform reallocations and copies.

Are protections for these baked into the defaults/protocol, or is there some 
reading+tweaking one could consider for hardening purposes, even at the cost of 
compatibility with « odd » clients?

Regards,
Tristan

[1] 
https://krebsonsecurity.com/2022/12/six-charged-in-mass-takedown-of-ddos-for-hire-sites/

Re: [ANNOUNCE] haproxy-2.9-dev7

2023-10-10 Thread Tristan



On 10/10/2023 14:04, Aleksandar Lazic wrote:

...
Well this implies that always a dpapi should run together with HAProxy 
if you want something like DNS resolving for server or anything else? 


I don't think Willy meant removing this much; from a previous discussion 
with him on the topic, it sounds more like wanting to remove things like 
regular SRV records resolution for server-template directives for example.



I think that the DNS stuff should be kept there and maybe be enhanced, as
it looks to me like some new security topics are using DNS more and more, like
ESNI, ECH, SVCB, ...


Neither of those really matter for HAProxy as a server though. ESNI/ECH 
merely has the relevant keys published on the domain's HTTPS or SVCB 
records, which (as far as I understand it) are there mostly to have more 
protocol hints baked into the preliminary DNS request (ie v4/v6 support, 
and which IPs for each; which ECH keys, what ports, what ALPNs, ...).


But all of this is baked into your public DNS zone, and not really 
something HAProxy (as a server, again) would ever use.


Now that said, if you mean using those between HAProxy and your 
backends... maybe. Though I'm a little bit dubious about the use-case, 
to be frank, since it's not expected for backend servers to use things 
like ECH.


That said, I do have some use-cases at the moment where I actively make 
use of SRV records on the backends internally, for which losing support 
would be a little annoying, so I can appreciate the will to keep them.


Should this be handled by dpapi and configured via socket api or any 
upcoming api in HAProxy?

If dpapi is to become necessary for very dynamic configurations, I just 
hope it follows a standard rather than expecting users to write adapters 
around it, since it's unlikely that anything but xDS (or similar) will have 
significant adoption in the near future, due to the rather high cost of 
supporting third-party APIs like those for various projects.


Somewhat the same reason why while the socket api is cool and useful, it 
doesn't really work out as nicely as it could when it comes to 
integrating it with other software.


Tristan



Options for mitigating CVE-2023-44487 with HAProxy

2023-10-10 Thread Tristan

Hi all,

This just got disclosed:
- 
https://cloud.google.com/blog/products/identity-security/google-cloud-mitigated-largest-ddos-attack-peaking-above-398-million-rps/
- 
https://cloud.google.com/blog/products/identity-security/how-it-works-the-novel-http2-rapid-reset-ddos-attack


Seems like a clever update to the "good old" h2 multiplexing abuse vectors:
1. client opens a lot of H2 streams on a connection
2. spams some requests
3. immediately sends h2 RST frames for all of them
4. goes back to 1 and repeats

The idea being to cause resource exhaustion on the server/proxy when it 
allocates stream-related buffers etc, and on the underlying server too, 
since it likely sees the requests before they get cancelled.


Looking at HAProxy I'd like to know if someone's aware of a decent 
mitigation option?


My current first guess was to adapt F5's recommendations here 
https://my.f5.com/manage/s/article/K000137106, ie set 
tune.h2.fe.max-concurrent-streams 10
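
In config terms that's just a one-liner in the global section (minimal 
sketch; 10 is F5's suggested value, not a universally right one):

global
  tune.h2.fe.max-concurrent-streams 10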


But I'm not sure if there's a way to apply the measure suggested (by 
Google this time) of catching misbehaving clients (opening more streams 
than advertised, or sending a lot of reset frames).


Finally I'll note that I don't currently have access to a good way to 
test how much it affects HAProxy. And maybe it doesn't due to some 
peculiar implementation detail. But that would be good to know.


Regards,
Tristan




Re: [ANNOUNCE] haproxy-2.9-dev7

2023-10-08 Thread Tristan
Since this was brought up, 

> On 7 Oct 2023, at 14:34, Willy Tarreau  wrote:
> 
> […]
> 
>> Maybe this will then bring up SPOE to a level where the body of a request
>> can be scanned and bring it to a full WAF level or as WASM filter.

Any thoughts on the feasibility of a WASM based alternative to the current LUA 
platform?

From what I've looked at, there are a few WASM runtimes set up for being 
embedded in C applications, though I'm not expert enough on the tradeoffs of 
each to know if there are dealbreakers.

I also realize that a lot of work went into the current LUA support (a look at 
the frighteningly long .c file for it speaks volumes).

But on one hand I find it rather difficult to use correctly in its current 
state, in part because of the complete absence (to my knowledge) of something 
equivalent to C headers for validation ahead of deployment, and also in part 
(and more personally) because I never understood what anyone could possibly 
like about LUA itself…

> […]
> 
>> Are there any plans to have something similar to XDS (
>> https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol ) for
>> dynamic configs at runtime, similar to the socket api and Data Plane API?
> 
> I used to have such plans a long time ago and even participated to a
> few udpapi meetings. But at this point I think it's not haproxy's job
> to perform service discovery

And that’s a very fair point. I wonder however how feasible it will 
realistically be from dpapi’s perspective to add that to its remit.

That said I’d definitely be very interested as well. As much as handcrafted 
configurations are nice, one quickly reaches their maintainability limits. And 
if we’re to stop abusing DNS again and again, proper service discovery is the 
way.

Tristan






Re: Stalled H3/QUIC Connections with Packet Loss

2023-09-24 Thread Tristan
That said Luke, even if this fixed the issue of this thread, it would be 
helpful if you (and other people using QUIC or considering it) reported on how 
it runs for you.

Because at the moment I have a couple of suspected bugs that I lack the 
reproducibility and diagnostic info to report, and some performance issues that 
might be my fault rather than HAProxy’s, which is always very hard to 
distinguish with performance metrics alone.

I think we should make a how-to-help wiki page btw, since the information is 
mostly spread across github issues, but the main 3 points would be:
- to build with ASAN [1]
- to enable traces (quic, qmux, and h3) using a ring in /dev/shm [2]
- to enable core dumps for haproxy [3]

1: Use clang if you don’t build haproxy directly on the runtime host btw. GCC 
doesn’t actually support static linking of ASAN. Longer rant on it here 
https://github.com/haproxy/haproxy/issues/2120#issuecomment-1515405340. Also set 
the following environment variable for the HAProxy process so coredumps don’t 
get prevented: 
ASAN_OPTIONS="disable_coredump=0:unmap_shadow_on_exit=1:abort_on_error=1"

2: For traces, you want to set up a ring (see the haproxy doc) on a ram-backed 
directory (typically /dev/shm/haproxy-quic for example) with at least 1 or 2 
minutes’ worth of traces in size (size depends mostly on the number of rqps; 
with 350rqps that works out to 120MB for 2.5 minutes or so here). This is where 
low-level detailed logs will be written, in a circular ram-backed file. Using 
disk-based files or stdout will just waste resources, if it even sustains the 
throughput, so definitely not a good idea to handwave that part.

In the end, it’ll look something like this:

ring traces
    format timed
    maxlen 3072
    size 134217728 # 128MB
    backing-file /dev/shm/haproxy-quic

Then in the global section:

traces h3 level developer verbosity minimal sink traces
traces qmux level developer verbosity minimal sink traces
traces quic level developer sink traces

3: Here it is supposed to be trivial, but alas. First you need to check the 
core_pattern sysctl. This is either a command or an output file path relative 
to the working directory of the HAProxy *processes* (not the root process; so 
if you use a chroot, it is a path within the chroot). You need to have it 
writable by those haproxy subprocesses, then you need to check that ulimit -c 
(max core size) is big enough for haproxy to dump fully (it’s often set to 0 
out of the box), and check that systemd doesn’t override it if you use it (it 
does, out of the box, and you need a systemd unit override like 
LimitCORE=infinity)
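
For the impatient, the checks boil down to something like this (as root; 
with systemd, the LimitCORE override is still needed on top):

# where cores go: a pipe command if it starts with '|', otherwise a file path
sysctl kernel.core_pattern
# raise the max core size for the current shell (often 0 out of the box)
ulimit -c unlimited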

Then with all this, when HAProxy crashes, you’ll have:
1. ASAN logs if applicable via stdout/stderr (I forget which, and the 
distinction remains one of the great pains of IT) which you can demangle using 
the asan symbolicate python script in clang’s source tree
2. Low level traces in /dev/shm/haproxy-quic.bak (on process restart the old 
ramfile is mv’d with a .bak suffix)
3. Your coredump in whatever path is in core_pattern

Alas, that is only for crash debugging.
For performance there’s no really good solution atm.
I just monitor the closest prometheus metric to %Tu 
(average_total_time_seconds) alongside process uptimes.
And how often users complain about issues, but that’s not the most reliable 
metric… Something like the NEL spec is the more scientific approach to this by 
far https://web.dev/network-error-logging/

Hopefully in the coming days I’ll find time to compile this long mail into a 
slightly more organized wiki article with snippets etc, but it will do in the 
meantime.

Tristan

> On 22 Sep 2023, at 16:07, Amaury Denoyelle  wrote:
> On Fri, Sep 22, 2023 at 03:30:58PM +0200, Luke Seelenbinder wrote:
>> If it's any help, here's `show quic full` for a stalled connection:
>> [...]
> 
> Tristan has been right, as we saw here fd=-1 meaning that there is
> probably a permission issue for bind() on connection sockets. You should
> have a look at the new setcap directive to fix it.
> 
> Thanks to Tristan previous report, we know there is a real performance
> issue when using QUIC listener sockets. We have to investigate this. You
> have probably encounter an occurence of it. In the meantime, it's
> important to ensure you run with connection dedicated socket instead.
> 
> The socket fallback has been implemented silently first as on some
> platform it may not be supported at all. We plan to change this soon to
> report a log on the first permission error to improve the current
> situation.
> 
> Hope this helps,
> 
> --
> Amaury Denoyelle


Re: Stalled H3/QUIC Connections with Packet Loss

2023-09-22 Thread Tristan
Hi Luke,

> Under some conditions (especially high request rate), H3 connections simply 
> stall. This appears to be more prevalent in some browsers than others (Safari 
> on iOS is the worst offender), and appears to coincide with high packet_loss 
> in `show quic`.


Out of curiosity, what version of HAProxy are you running?

I don’t have a magic answer but in 
https://github.com/haproxy/haproxy/issues/2095#issuecomment-1570547484 we’ve 
been looking at performance issues with H3/QUIC over time, and there’s a couple 
of workarounds currently relevant (~ from the comment linked), notably to use a 
short timeout connect (currently using 500ms here), and to comment out timeout 
client-fin.
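
In config terms, the workarounds look roughly like this (a sketch; the 500ms 
value is just what we currently run with, not a recommendation):

defaults
  timeout connect 500ms
  # note: no "timeout client-fin" line at all, per the second workaround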

I see you explicitly set per-connection socket ownership, which is definitely 
the right thing to do, but you should check that it is actually working « for 
real » and not silently disabled due to permissions somewhere inside the worker 
processes.
To do that, have a look at the output of ‘show fd’ and check that H3 conns 
aren’t all with fd=-1 (instead it should be a different positive fd number per 
H3 conn)
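
Assuming the stats socket lives at /var/run/haproxy/admin.sock as in other 
examples on this list, that check is simply:

echo "show fd" | socat - /var/run/haproxy/admin.sock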

I’d definitely like to see the JSFiddle if there’s a trick to the kind of 
request affected to see if I can reproduce it on our nodes.

Regards,
Tristan 

Re: WebTransport support/roadmap

2023-08-16 Thread Tristan
Looks like that's Websocket for udp/QUIC just because the Websocket 
Protocol does not work with QUIC, imho.


From a cursory read of 
https://github.com/w3c/webtransport/blob/main/explainer.md, it seems to 
have slightly different goals from traditional Websocket though.


Notably to sacrifice message ordering to get rid of head-of-line blocking.



Cite from https://datatracker.ietf.org/doc/html/draft-ietf-webtrans-http2/

```
By relying only on generic HTTP semantics, this protocol might allow 
deployment using any HTTP version. However, this document only defines 
negotiation for HTTP/2 [HTTP2] as the current most common TCP-based 
fallback to HTTP/3.

```


They do have a separate HTTP/3 specific draft here: 
https://datatracker.ietf.org/doc/html/draft-ietf-webtrans-http3


When I look back at what a nightmare Websocket, across its different 
versions, was to implement, this variant for QUIC will not be much 
easier, from my point of view.


Now as for complexity, I can only agree, and I hope it will be simpler 
for everyone, implementers as well as users... since ease of use really 
was not a strong point of WS...


Tristan



Re: problem with automatic OCSP update -- getting ipv6 address for ocsp endpoint

2023-08-15 Thread Tristan
Hi Shawn,

See the note at the end of 
http://docs.haproxy.org/2.8/configuration.html#5.1-ocsp-update

Specifically:

> A common error that can happen with let's encrypt certificates is if the DNS
resolution provides an IPv6 address and your system does not have a valid
outgoing IPv6 route. In such a case, you can either create the appropriate
route or set the "httpclient.resolvers.prefer ipv4" option in the global
section.
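
Concretely, that's a one-line addition:

global
  httpclient.resolvers.prefer ipv4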
Hopefully fixes your issue, as it did when I had the same issue.

Tristan

ps: I hope this message doesn't get sent as html, since I'm writing from my 
phone... apologies if it ends up doing so

> On 16 Aug 2023, at 01:37, Shawn Heisey  wrote:
> 
> I've got another haproxy install on which I am trying to enable automatic 
> OCSP updating.  The ones I asked about before are personal, this one is for 
> work.
> 
> When haproxy looks up the host where it can get OCSP responses, it is getting 
> an ipv6 address.
> 
> Aug 15 18:27:30 - haproxy[11234] -:- [15/Aug/2023:18:27:30.103]  
> /etc/ssl/certs/local/imat_us.wildcards.combined.pem 2 "HTTP error" 1 0
> Aug 15 18:27:30 - haproxy[11234] -:- [15/Aug/2023:18:27:30.104]  
> -/- 48/0/-1/-1/46 503 217 - - SC-- 0/0/0/0/3 0/0 
> {2600:1405:7400:13::17de:1b94} "GET 
> http://r3.o.lencr.org/MFMwUTBPME0wSzAJBgUrDgMCGgUABBRI2smg%2ByvTLU%2Fw3mjS9We3NfmzxAQUFC6zF7dYVsuuUAlA5h%2BvnYsUwsYCEgRA%2BzJf7gt%2BI21Isq6Sy8pDxg%3D%3D
>  HTTP/1.1"
> 
> If I try the URL in the second log line with curl, I get the proper response. 
>  The curl program is getting an ipv4 address.
> 
> I thought it might be doing this because the machine did have an ipv6 local 
> link address, so I completely disabled ipv6 with the grub commandline and 
> rebooted.  Now there is no ipv6 address, but haproxy is still getting an ipv6 
> address for r3.o.lencr.org.
> 
> I couldn't locate any config for haproxy that would disable ipv6.  Is there a 
> way to fix this problem?
> 
> HAProxy version 2.8.2 2023/08/09 - https://haproxy.org/
> Status: long-term supported branch - will stop receiving fixes around Q2 2028.
> Known bugs: http://www.haproxy.org/bugs/bugs-2.8.2.html
> Running on: Linux 4.18.0-477.21.1.el8_8.x86_64 #1 SMP Thu Aug 10 13:51:50 EDT 
> 2023 x86_64
> Build options :
>  TARGET  = linux-glibc
>  CPU = native
>  CC  = cc
>  CFLAGS  = -O2 -march=native -g -Wall -Wextra -Wundef 
> -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits 
> -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond 
> -Wnull-dereference -fwrapv -Wno-address-of-packed-member -Wno-unused-label 
> -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered 
> -Wno-missing-field-initializers -Wno-cast-function-type -Wno-string-plus-int 
> -Wno-atomic-alignment
>  OPTIONS = USE_OPENSSL=1 USE_ZLIB=1 USE_SYSTEMD=1 USE_QUIC=1 USE_PCRE2_JIT=1
>  DEBUG   =
> 
> Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY 
> +CRYPT_H -DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE 
> -LIBATOMIC +LIBCRYPT +LINUX_SPLICE +LINUX_TPROXY -LUA -MATH -MEMORY_PROFILING 
> +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL -OPENSSL_WOLFSSL -OT -PCRE +PCRE2 
> +PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL -PROMEX -PTHREAD_EMULATION +QUIC 
> +RT +SHM_OPEN -SLZ +SSL -STATIC_PCRE -STATIC_PCRE2 +SYSTEMD +TFO +THREAD 
> +THREAD_DUMP +TPROXY -WURFL +ZLIB
> 
> Default settings :
>  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
> 
> Built with multi-threading support (MAX_TGROUPS=16, MAX_THREADS=256, 
> default=4).
> Built with OpenSSL version : OpenSSL 3.1.2+quic 1 Aug 2023
> Running on OpenSSL version : OpenSSL 3.1.2+quic 1 Aug 2023
> OpenSSL library supports TLS extensions : yes
> OpenSSL library supports SNI : yes
> OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
> OpenSSL providers loaded : default
> Built with network namespace support.
> Built with zlib version : 1.2.11
> Running on zlib version : 1.2.11
> Compression algorithms supported : identity("identity"), deflate("deflate"), 
> raw-deflate("deflate"), gzip("gzip")
> Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT 
> IP_FREEBIND
> Built with PCRE2 version : 10.32 2018-09-10
> PCRE2 library supports JIT : yes
> Encrypted password support via crypt(3): yes
> Built with gcc compiler version 8.5.0 20210514 (Red Hat 8.5.0-18)
> 
> Available polling systems :
>  epoll : pref=300,  test result OK
>   poll : pref=200,  test result OK
> select : pref=150,  test result OK
> Total: 3 (3 usable), will use epoll.
> 
> Available multiplexer protocols :
> (protocols marked as  cannot be specified using 'proto' keyword)
>   quic : mode=HTTP  side=FE mux=QU

Re: OCSP update mechanism startup

2023-07-07 Thread Tristan



On 07/07/2023 16:34, Willy Tarreau wrote:

On Fri, Jul 07, 2023 at 03:06:52PM +, Tristan wrote:

The ocsp-update option should be between brackets
/etc/haproxy/ssl/mangadex.dev.pem [ocsp-update on] mangadex.dev
*.mangadex.dev


Oh that makes more sense indeed; should have guessed so since other crt-list
bind params used those indeed...


I think that we should at least emit a warning when domain names not
containing any dot are found, because IIRC these are not valid for
SNI anyway thus there should be no valid reason that an option name
is accepted as a server name.


As an aside, I found the doc relating to preserving bind's strict-sni 
behaviour to be a little bit unclear.


Indeed, I first tried just keeping strict-sni on the bind line, with no 
explicit SNI filters specified on certificate lines, and that caused 
SSL failures.

And it wasn't serving a default cert, but rather just failing the SSL conns.

My uneducated guess is that crt-list just doesn't take the bind line's 
strict-sni into account in such cases, or doesn't extract CNs/SANs. But 
the doc explicitly says that CNs/SANs are taken into account, so I 
assume the former.


After a bit of trial-and-error I found out that a working combination was:
1. strict-sni on the bind line
2. explicit sni filters on cert lines in the crt-list

Tho I'd bet that in this case only 2. does anything, and 1. isn't really 
doing anything anymore.
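
For anyone landing here later, the working combination looks like this 
(a sketch based on my setup from earlier in this thread; adjust paths 
and domains accordingly):

# on the bind line
bind :443 ssl crt-list /etc/haproxy/ssl/crtlist-mdex-public-direct.conf strict-sni

# and in the crt-list file, explicit SNI filters after the bracketed options
/etc/haproxy/ssl/mangadex.dev.pem [ocsp-update on] mangadex.dev *.mangadex.dev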


Also personally I have never understood the point of default server 
certs... besides getting unwanted attention from censys/shodan/etc...



Hmm, I understand why it was decided to go that route, and indeed that is
probably not too too hard to do... Though I can't help wonder if an opt-in
mechanism similar to the server state file would make sense for it?


I seem to remember that it's possible to issue a "show ocsp" on the CLI
and direct it to a file, but I could be wrong. That's the same way the
server-state works by the way.


I'll take a look at it, though the main advantage would be to keep it 
all in HAProxy and to avoid dirty cron jobs. But I appreciate that's me 
asking for HAProxy to do everything and more beyond its job... :)
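
(For reference, the dump Willy mentions would presumably be something 
like the following, using the socat idiom from earlier in this thread; 
the output path is arbitrary, and I still need to check what exactly it 
emits:)

echo "show ssl ocsp-response" | socat - /var/run/haproxy/admin.sock > /tmp/ocsp-dump.txt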


Tristan



Re: OCSP update mechanism startup

2023-07-07 Thread Tristan

The ocsp-update option should be between brackets
/etc/haproxy/ssl/mangadex.dev.pem [ocsp-update on] mangadex.dev 
*.mangadex.dev


Oh that makes more sense indeed; should have guessed so since other 
crt-list bind params used those indeed...



- does the OCSP update mechanism update the files on-disk?


No we never write anything on disk.

[...]
- if not, what happens if upon restart/reload of HAProxy the .ocsp 
file is outdated? will HAProxy aggressively try to get an up-to-date 
response before starting its listeners or will I be risking ssl issues 
by enabling OCSP must-staple with it?


The OCSP update mechanism will not block anything, it runs alongside all 
the "regular" HAProxy tasks.
If I remember correctly, you cannot load outdated OCSP responses so you 
should not face this particular problem. But if you have many 
certificates for which OCSP update was activated and no OCSP response 
was provided, fetching all the missing responses will indeed take some 
time and OCSP stapling will temporarily fail for the given server 
certificates.


Hmm, I understand why it was decided to go that route, and indeed that 
is probably not too too hard to do... Though I can't help wonder if an 
opt-in mechanism similar to the server state file would make sense for it?


Either way, it all worked flawlessly within these constraints after 
putting the brackets, so that's great. Thanks.


Tristan



Re: QUIC (mostly) working on top of unpatched OpenSSL

2023-07-06 Thread Tristan

Hi Willy,

Thanks for sharing that. First, I'm amazed that such a hacky method 
works well-enough to get QUIC (nearly-fully) working.


Now for your concerns... Honestly, I agree with you and really don't 
want to see a brand new protocol compromised on.


Whether one calls it "ossification" or "degraded", it would be a 
compromise on a spec that hasn't even reached mass adoption yet, because 
of one single stubborn library's committee.


So many hours of our lives (in IT) are wasted on dealing with poor 
decisions from the past, leading to inconsistent specs, surprising 
behaviours, and security holes left and right.
And these *always* have some obscure justification like "but at the 
time, popular software $XYZ wasn't compliant/nice/$whatever, so everyone 
was forced to give in, and this is why the spec is now crap".


I've spent far too long complaining and cursing about these things to 
not feel strongly against it as it unfolds in my own time for once, 
instead of feeling helpless because it happened 25 years ago.


Now that said, I'd understand if HAProxy went forward with it, even if 
that feels bad.

I think it'd be a smart thing for the project to do.
Not for technical reasons, but from a "marketing" standpoint (and I 
don't mean the word in a negative way). If other similar projects adopt 
it, then HAProxy would be the only one missing it, and be perceived as 
lacking a feature from a prospective user's point of view. After all, 
people *will* keep comparing HAProxy with nginx & friends. And they 
won't really care about the hacks that were necessary to get there.


So while it'd be sad, and I'd much much prefer seeing wolfSSL become the 
new standard instead, it would not be that unreasonable for HAProxy to 
adopt the patch either...


Between a rock and a hard place,
Tristan



OCSP update mechanism startup

2023-07-06 Thread Tristan

Hello,

I'm trying to make use of the new ocsp-update mechanism, and finding no 
success (yet).


I've migrated my crt bind arguments to a crt-list argument (+ relevant 
file) and that loads in and gets used fine, but despite having 
"ocsp-update on" on the lines, I'm not seeing any ocsp update being 
triggered (and naturally there are no ocsp responses being returned).


Relevant outputs:

$ echo "show ssl crt-list" | socat - /var/run/haproxy/admin.sock
/etc/haproxy/ssl/crtlist-mdex-public-direct.conf
...

$ echo "show ssl crt-list 
/etc/haproxy/ssl/crtlist-mdex-public-direct.conf" | socat - 
/var/run/haproxy/admin.sock

/etc/haproxy/ssl/mangadex.dev.pem ocsp-update on mangadex.dev *.mangadex.dev

$ echo "show ssl ocsp-updates" | socat - /var/run/haproxy/admin.sock
OCSP Certid | Path | Next Update | Last Update | Successes | Failures | 
Last Update Status | Last Update Status (str)


One potential hint I find is when using "update ssl ocsp-response 
<certfile>":
$ echo "update ssl ocsp-response /etc/haproxy/ssl/mangadex.dev.pem" | 
socat - /var/run/haproxy/admin.sock
'update ssl ocsp-response' only works on certificates that already have 
a known OCSP response.

Can't send ocsp request for /etc/haproxy/ssl/mangadex.dev.pem!

And indeed, the (cli command) docs mention:
> This command will only work for certificates
that already had a stored OCSP response, either because it was provided
during init or if it was previously set through the "set ssl cert" or "set
ssl ocsp-response" commands

So then:
- does the ocsp-update feature also require an initial OCSP response to 
start at all, like the cli command?

- if so, why? (and we should get that documented on the ocsp-update flag)
- does the OCSP update mechanism update the files on-disk?
- if not, what happens if upon restart/reload of HAProxy the .ocsp file 
is outdated? will HAProxy aggressively try to get an up-to-date response 
before starting its listeners or will I be risking ssl issues by 
enabling OCSP must-staple with it?


(Probably not super relevant, but this is with HAProxy 2.8.1 and QuicTLS 
1.1.1t)


Thanks in advance,
Tristan



Re: [ANNOUNCE] haproxy-2.8.0

2023-05-31 Thread Tristan

Congratulations to the team at large for the release!
There's definitely been more improvements and fixes than meets the eye 
from the release notes alone!


On 31/05/2023 16:14, Willy Tarreau wrote:


- QUIC: it has been running almost flawlessly for a year on haproxy.org,
   and totally flawlessly over the last 6 months [...] >we now consider it 
production-ready [...]
All I can say is that this was a long time coming since the early days 
of 2.6.0, and that the amount of bug fixes and improvements Amaury and 
Fred provided to make this happen has been nothing short of amazing.


A big thank you to them for their patience with my reports and not 
giving up on even the most obscure issues!


As a parting gift, I don't have a CSS nit to report like Tim, but I do 
have this prophetic comment from Willy almost a year ago on the matter: 
https://github.com/haproxy/haproxy/issues/1808#issuecomment-1230660870 :)


Tristan



Re: haproxy -dKcnv output

2023-05-31 Thread Tristan


Is it? In all the programming languages I use, the colon is followed by 
the return type (which for iif is str). 


my claim of mainstream-ness was mainly about the ": in => out" order 
(one example would be most ML languages, Typescript, Java...) as opposed 
to ": out <= in", which I haven't seen being used so far


Of course the '=>' also implies 
return type and now I'm seeing two return types there.



[...]


I think it's pretty clear if you look at the usage in a config. Example 
from my production config:


     ssl_fc,iif(https,http)

That works like the pipeline operator in shell or various programming 
languages. Effectively the 'ssl_fc' result acts as the first parameter, 
it's just written a little differently.


I think we're just seeing it in a different way

I see converters (and shell pipe operations for that matter) as 
functions being chained, so their individual types must be those of a 
function (ie: in => out) and cannot be "str", since that is the type of 
a value


Running with this on the example of shell pipes, where the fact that 
things usually also support these forms:


tool [...args] file
tool [...args] -

is interpretable as a convenience parenthesis-less form of:
(tool [...args])(file)
(tool [...args])(-)

And the left-side portion of the pipe (not sure if that has a precise 
name?) is not an implicit first argument to the right side, but rather 
the only argument of a nameless function defined on the right side


In that view then

> var x = iif("foo","bar")
here x is a function, of type bool => str

> var y = x(true)
and y is the result of giving a bool to a bool => str, and thus an str

> ssl_fc,iif(https,http)
is then pretty much akin to iif(https,http)(ssl_fc)

and that is why iif(str,str) being "bool => str" seems sensible to me

Now one can be pedantic and then argue to write it like this:
iif: (str,str) => (bool => str)

which is what many ML languages go with, and has some elegance, but is 
terrible for readability and that I wouldn't defend



     iif(str,str): bool => str

hides the type of the "important" parameter in the middle (and has the 
colon = return type problem).


That's a fair point admittedly; maybe I've just grown too used to that 
downside




     iif(str,str): str <= bool

at least has the parameter on an outer position, but the order is still 
reversed from the config.


fwiw while I said it seems to be an unusual notation order, it does have 
the benefit of being clear, indeed




     iif(bool,str,str): str

is reasonably similar to the config, but not identical


The similarity is specifically what I dislike about it, as it does seem 
a lot like this


http-request set-var(txn.proto) iif(ssl_fc,https,http)

should be a valid config line
or at least I think that wouldn't be entirely unreasonable to think that 
at first glance




     bool,iif(str,str): str

which I didn't mention before might possibly be the best choice?


I do quite like that

Best regards, and thanks for the bike-shedding exercise :D
Tristan



Re: haproxy -dKcnv output

2023-05-31 Thread Tristan
If fetches already have the output type after the colon, then the 
converter should not have the input type after the colon, i.e.


     iif(str,str): bool => str

is confusing, because it looks like it returns a bool, ... I guess?


While this is mainly a feelings thing, I'd say that it is more 
widespread (if not nowadays near-mainstream? not that this is 
necessarily a good argument of course) to write it that way




     iif(str,str): str <= bool



Though I guess this one is clear enough, if unusual (to me anyway)

Another option might be listing the "implicit" parameter as a real 
parameter:


     iif(bool,str,str): str


This one feels quite unnatural to me; unless you're familiar with the 
underlying implementation of many OOP systems and see this as some kind 
of extension method definition, the fact that the first parameter is 
implicitly passed seems (to me) like it's coming out of nowhere


Tristan



Re: Drain L4 host that fronts a L7 cluster

2023-05-05 Thread Tristan

however, our reason to migrate to HAproxy is adding gRPC
compliance to the stack, so H2 support is a must. Thanks for the
workarounds, indeed interesting, I'll check them out.


From a cursory look at the gRPC spec it seems like you would indeed 
really need the GOAWAY to get anywhere



trigger the GOAWAY H2 frames (which isn't possible at

the moment, as far as I can tell)

*Is this a valid feature request for HAProxy?*
Maybe, we can provide "setting CLO mode" via the "http-request" directive?


I can't make that call, but at least it sounds quite useful to me indeed.

And in particular, being able to set CLO mode is likely a little bit 
nicer in the long run than something like a hypothetical 'http-request 
send-h2-goaway', since CLO mode can account for future protocols or spec 
changes transparently as those eventually get added to HAProxy.


Interesting problem either way!

Cheers,
Tristan



Re: Drain L4 host that fronts a L7 cluster

2023-05-05 Thread Tristan

Hi Abhijeet,


Problem statement is, how do you drain a node [...] L7 constructs like "Connection: 
close" or "GOAWAY
h2 frames" 
[...] > * For any map (L4 client IP lookup) based solution, I was unable to

find any http-request operation that sets "drain mode".


Indeed the managed drain mode is an all-or-nothing at the moment. The 
"only" missing feature to fully replicate it with http-request rules 
would be a way to trigger the GOAWAY H2 frames (which isn't possible at 
the moment, as far as I can tell).


But in the meantime, the following might do the job:

frontend foo
    # tcp-request connection sees the "real" src in accept-proxy listeners, according to 7.3.3#src
    # (sess scope here, since txn variables aren't available yet at connection time)
    tcp-request connection set-var(sess.src_l4_ip) src

    # assuming a stick-table here, but should work just fine with maps/lists too
    http-after-response set-header Connection close if { var(sess.src_l4_ip),in_table(draining-clients) }


At that point we have the HTTP/1.1 part handled but that was easy 
enough... Now for H2 there's no good way that I can see.


But I can think of a few hacky solutions that might work (ie would need 
to test them in practice):
1. Returning an HTTP 421 (see the sketch after this list), which H2 
clients should interpret as being instructed to use a different 
connection to issue the request, and which browsers should also retry 
automatically
2. Adding an Alt-Svc header to those responses advertising http/1.1 
explicitly, which clients should pick up and prefer (then they either 
land on another LB while establishing the new connection, or get 
Connection-close'd)
3. Returning an HTTP 426, instructing the client to "upgrade" to 
http/1.1 (not sure how that interacts with ALPN, but the spec seems to 
suggest that the client would give the upgrade request priority), which 
ends up with the same outcome as the Alt-Svc solution if it works
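
To make option 1 concrete, a minimal sketch reusing the variable and 
table names from the snippet above (the body string is arbitrary, and 
421 only makes sense for H2 connections, hence the fc_http_major guard):

http-request return status 421 content-type "text/plain" string "drained" if { fc_http_major -m int 2 } { var(sess.src_l4_ip),in_table(draining-clients) }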


Of course it'd be many times better to be able to just trigger a GOAWAY 
instead, but it was fun to think about more "creative" options, so here 
they are.


Tristan



Re: gitlab server behind haproxy never switches to http/3

2023-04-27 Thread Tristan
I did figure out that ufw was not allowing udp/443.  So it turns out 
that wasn't working for any website on that install.


Ah yep, the fact that allowing 443 only implies allowing tcp/443, and that you 
need to explicitly allow udp/443, is yet another "quirky" thing (to be polite...)
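
ie, depending on how the tcp rule was originally added, the missing piece is 
something along the lines of:

ufw allow 443/udp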


But even after allowing udp/443, it is still not switching.  If I enter 
the URL into https://http3check.net/ it does say that http3 works.


Yup that sounds about right

Hmm.  The web server is on the local gigabit LAN with the client.  Would 
that give TCP a significant enough boost that it could beat UDP?


Hard to say; in our case it seemed more random than actually driven by 
anything remotely clear. I'm merely quoting official word on it, but 
have yet to dig into the actual source code for how it happens. The only 
platform I know that gets reliable QUIC enabling is CF, so maybe someone 
from their side could chime in if they ever see this.


I have chrome, not chromium.  Substituting /opt/google/chrome/chrome for 
chromium and running it with those options, it DOES do http/3.  The 
lightning bolt is orange instead of blue.


Then yep, you're in the same boat as we were. It switched for no reason 
one day. Even trying HTTPS/SVCB DNS records did nothing for us until it 
"magically" decided to use H3.


Tristan



Re: gitlab server behind haproxy never switches to http/3

2023-04-27 Thread Tristan
One of the websites I have behind it is my gitlab server.  That is 
always http/2, it never switches to http/3.


Does anyone know why that happens, and whether there is anything I can 
do about it?  The alt-svc header IS received by the browser.


Browsers unfortunately do not give much information in such cases. Your 
best bet is going to {chrome, edge}://net-export/ and recording a session.


Sometimes the browser will be kind enough to acknowledge it with a 
reason there, but usually it won't (which is quite infuriating, if 
someone working with browser vendors reads this).


Similarly to you, on our site we had the homepage (ie /) refuse to ever 
use HTTP3 for the longest time, while all other pages would. Then it 
suddenly worked. And then it didn't anymore. etc. Never knew why, never 
showed up in net-export nor anywhere else for that matter.


As far as I know, the main way it happens is that the browser:
- races H2 and H3 and picks the fastest (then remembers it)
- retries on H2 in case of an H3 issue (then remembers it)

And how long they remember is not documented visibly anywhere that I 
could find either (except when it shows up in the net-export; then it 
says until when it won't try again, but most of the time it doesn't show 
up, again).


You can try something like that to force it to use H3 and reveal 
whatever issue it might be having:
chromium-browser --enable-quic 
--origin-to-force-quic-on=your-gitlab-host.com:443


And see if anything's broken which would explain why your browser then 
refuses to ever use H3 with it.


If you see something then best is to raise an issue; my guess is that 
Amaury or Fred will want traces to be able to dig into it (like so 
https://github.com/haproxy/haproxy/issues/2004#issuecomment-1398607279) 
but that's for them to request if and when they judge it necessary.


Tristan

PS: I see you already do, but for others reading this, you definitely 
want 2.7.7 or a 2.8 base after 7b516d3732aab5999fc8f4a3037a4317d35d3618 
if you're playing with H3/QUIC. The recent few commits make a 
significant difference in compatibility.




regression test suite for haproxy

2013-11-26 Thread Tristan Delhalle
Hi,

I am packaging haproxy for a distribution and want to run all the 
non-regression tests.
I couldn't find a user-facing test suite like "make check" and so on.
Are there guidelines for how to execute the tests in the tests/ 
directory (*.cfg, *.c)?

Thanks for any reply