[PATCH 0/1] enabling ssl keylog for LibreSSL 3.5.0

2023-05-23 Thread Ilya Shipitsin
This was found while running the QUIC Interop suite against LibreSSL.

Ilya Shipitsin (1):
  BUILD: SSL: enable TLS key material logging if built with LibreSSL>=3.5.0

 include/haproxy/openssl-compat.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
2.40.1




[PATCH 1/1] BUILD: SSL: enable TLS key material logging if built with LibreSSL>=3.5.0

2023-05-23 Thread Ilya Shipitsin
LibreSSL has implemented TLS key material logging since 3.5.0, so let's enable it
---
 include/haproxy/openssl-compat.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/haproxy/openssl-compat.h b/include/haproxy/openssl-compat.h
index 7fb153810..ed162031c 100644
--- a/include/haproxy/openssl-compat.h
+++ b/include/haproxy/openssl-compat.h
@@ -88,7 +88,8 @@
 #define HAVE_SSL_SCTL
 #endif
 
-#if (HA_OPENSSL_VERSION_NUMBER >= 0x10101000L)
+/* minimum OpenSSL 1.1.1 & libreSSL 3.5.0 */
+#if (defined(LIBRESSL_VERSION_NUMBER) && (LIBRESSL_VERSION_NUMBER >= 0x3050000fL)) || (HA_OPENSSL_VERSION_NUMBER >= 0x10101000L)
 #define HAVE_SSL_KEYLOG
 #endif
 
-- 
2.40.1
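For context, once HAVE_SSL_KEYLOG is available at build time, the key
material can be dumped from the configuration by enabling tune.ssl.keylog
and logging the keylog sample fetches. A minimal sketch for a test
environment only (the frontend name, certificate path and log destination
are illustrative, the fetch names are the ones documented for OpenSSL
builds, and only one TLS 1.3 secret is shown):

   global
       tune.ssl.keylog on
       log stdout format raw local0

   frontend fe_tls
       bind :8443 ssl crt /etc/haproxy/cert.pem
       log global
       # Emits one SSLKEYLOGFILE-style line per connection; the other
       # ssl_fc_*_secret sample fetches can be logged the same way.
       log-format "CLIENT_HANDSHAKE_TRAFFIC_SECRET %[ssl_fc_client_random,hex] %[ssl_fc_client_handshake_traffic_secret]"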




Re: [PATCH] re-enable EVP_chacha20_poly1305() for LibreSSL

2023-05-23 Thread Willy Tarreau
On Tue, May 23, 2023 at 04:57:05PM +0200, Willy Tarreau wrote:
> Hi Ilya,
> 
> On Sun, May 21, 2023 at 12:57:21PM +0200, Илья Шипицин wrote:
> > Hello,
> > 
> > that exclusion was only needed for pre-3.6.0 LibreSSL; support was added
> > in 3.6.0, so every currently supported LibreSSL release has it, no need
> > to keep the "ifdef"
> 
> While I'm probably not the one who will be the best to review this, you
> forgot to attach the patch :-)  (for once it's not me).

Now merged, thank you!

Willy



Fwd: couple of questions on QUIC Interop

2023-05-23 Thread Frederic Lecaille
Forgot to reply to all!


 Forwarded Message 
Subject: Re: [EXTERNAL] couple of questions on QUIC Interop
Date: Tue, 23 May 2023 17:12:26 +0200
From: Frederic Lecaille 
To: Илья Шипицин 

On 5/22/23 12:00, Илья Шипицин wrote:
> Hello,

Hello,

> I played with the QUIC Interop suite (for HAProxy + LibreSSL) over the weekend...
> 
> couple of questions
> 
> 1) the patch haproxy-qns/0001-Add-timestamps-to-stderr-sink.patch
> (master branch of haproxytech/haproxy-qns on github.com) is not included
> in haproxy upstream, is there a good reason for that?

There is a good reason. During the tests haproxy must write its traces
to a file which is automatically exported by the interop runner to a
website. This patch configures the sink (stderr) used by the QUIC trace
module so that timestamps are added. That was not possible without
patching haproxy; I am not sure whether this is still the case. Anyway,
as this is specific to the interop test environment, we do not care
about upstreaming it.

> 2) why "quic-dev" repo is used, not primary haproxy upsteam
> ? haproxy-qns/Dockerfile at master · haproxytech/haproxy-qns
> (github.com)
> 

The interop runner is used to test haproxy quic-dev repo sources before
they are merged into haproxy upstream.




Re: [PATCH] re-enable EVP_chacha20_poly1305() for LibreSSL

2023-05-23 Thread Илья Шипицин
also, there'll be a patch soon for unlocking keylog in
haproxy/openssl-compat.h (haproxy/haproxy master on GitHub) for LibreSSL
(it was too boring to run QUIC Interop without keylog)

Tue, 23 May 2023 at 17:06, Илья Шипицин :

> oops.
>
> btw, not enabling chacha20_poly1305 leads to LibreSSL API usage
> inconsistency, see "QUIC regression on LibreSSL-3.7.2 (HAProxy)",
> issue #860 in libressl/portable (github.com)
> 
>
> it is claimed that OpenSSL does not check for null deref as well, so
> LibreSSL just mimics it :)
> joke.
>
> Tue, 23 May 2023 at 16:57, Willy Tarreau :
>
>> Hi Ilya,
>>
>> On Sun, May 21, 2023 at 12:57:21PM +0200, Илья Шипицин wrote:
>> > Hello,
>> >
>> > that exclusion was only needed for pre-3.6.0 LibreSSL; support was added
>> > in 3.6.0, so every currently supported LibreSSL release has it, no need
>> > to keep the "ifdef"
>>
>> While I'm probably not the one who will be the best to review this, you
>> forgot to attach the patch :-)  (for once it's not me).
>>
>> Willy
>>
>


Re: [PATCH] DOC/MINOR: config: Fix typo in description for `ssl_bc` in configuration.txt

2023-05-23 Thread Willy Tarreau
On Mon, May 22, 2023 at 01:11:13PM -0500, Mariam John wrote:
> From: Mariam John 
> 
> Fix a minor typo in the description of the `ssl_bc` sample fetch method
> described under Section `7.3.4. Fetching samples at Layer 5` in
> configuration.txt. Changed `other` to `to`.

Good catch, now applied.

Thanks!
Willy



Re: [PATCH] re-enable EVP_chacha20_poly1305() for LibreSSL

2023-05-23 Thread Илья Шипицин
oops.

btw, not enabling chacha20_poly1305 leads to LibreSSL API usage
inconsistency, see "QUIC regression on LibreSSL-3.7.2 (HAProxy)",
issue #860 in libressl/portable (github.com)


it is claimed that OpenSSL does not check for null deref as well, so
LibreSSL just mimics it :)
joke.

Tue, 23 May 2023 at 16:57, Willy Tarreau :

> Hi Ilya,
>
> > On Sun, May 21, 2023 at 12:57:21PM +0200, Илья Шипицин wrote:
> > Hello,
> >
> > that exclusion was only needed for pre-3.6.0 LibreSSL; support was added
> > in 3.6.0, so every currently supported LibreSSL release has it, no need
> > to keep the "ifdef"
>
> While I'm probably not the one who will be the best to review this, you
> forgot to attach the patch :-)  (for once it's not me).
>
> Willy
>
From 4c2a848a9e9eb244244082c29cdcd5eebddbf9c5 Mon Sep 17 00:00:00 2001
From: Ilya Shipitsin 
Date: Sun, 21 May 2023 12:51:46 +0200
Subject: [PATCH 1/3] BUILD: chacha20_poly1305 for libressl

This reverts commit d2be9d4c48b71b2132938dbfac36142cc7b8f7c4.

LibreSSL implements EVP_chacha20_poly1305() with EVP_CIPHER for every
released version starting with 3.6.0
---
 include/haproxy/quic_tls.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/haproxy/quic_tls.h b/include/haproxy/quic_tls.h
index a2eb2230a..35efbb91d 100644
--- a/include/haproxy/quic_tls.h
+++ b/include/haproxy/quic_tls.h
@@ -118,10 +118,8 @@ static inline const EVP_CIPHER *tls_aead(const SSL_CIPHER *cipher)
return EVP_aes_128_gcm();
case TLS1_3_CK_AES_256_GCM_SHA384:
return EVP_aes_256_gcm();
-#if !defined(LIBRESSL_VERSION_NUMBER)
case TLS1_3_CK_CHACHA20_POLY1305_SHA256:
return EVP_chacha20_poly1305();
-#endif
 #ifndef USE_OPENSSL_WOLFSSL
case TLS1_3_CK_AES_128_CCM_SHA256:
return EVP_aes_128_ccm();
-- 
2.39.2.windows.1



Re: [PATCH] re-enable EVP_chacha20_poly1305() for LibreSSL

2023-05-23 Thread Willy Tarreau
Hi Ilya,

On Sun, May 21, 2023 at 12:57:21PM +0200, Илья Шипицин wrote:
> Hello,
> 
> that exclusion was only needed for pre-3.6.0 LibreSSL; support was added
> in 3.6.0, so every currently supported LibreSSL release has it, no need
> to keep the "ifdef"

While I'm probably not the one who will be the best to review this, you
forgot to attach the patch :-)  (for once it's not me).

Willy



Re: Drain L4 host that fronts a L7 cluster

2023-05-23 Thread Willy Tarreau
Hi Abhijeet,

On Mon, May 22, 2023 at 12:30:52PM -0700, Abhijeet Rastogi wrote:
> Hi Willy,
> 
> Thank you for the response. It's great to know that this might be
> considered as a feature request in future versions, pending
> prioritization though.
> 
> Could you comment on why this isn't already a feature yet?

Huh ? It's not easy to comment on an existing state. It's the combination
of what people needed, what is already available and what can cause trouble.

> It is hard
> to believe that we're the first to come across this draining problem
> when using a chain of "L4 to L7 proxies". Are there any alternative
> approaches that we should look at, to get around the current
> limitations?

Others are likely doing this using a regular soft-stop. During the
soft-stop, there will be a combination of active and passive closures
on idle connections (you can even decide over how long a period you want
to close them all, so that you don't suddenly close too many at once and
cause an inrush of traffic on the other nodes). So the "normal"
way to stop traffic to an L7 node in an L4+L7 setup is:

  1) disable sending new connections to the L7 node (set its weight
 to zero or make it fail a health check for example)

  2) send the soft-stop signal to haproxy

  3) wait for the process to be gone after the last connection is
 closed

If you're dealing with a reload, the old process automatically passes
through the soft-stop phase and does this without having to fiddle with
a front L4 LB.
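A possible shape for these three steps, assuming the front L4 LB is itself
an haproxy instance with a stats socket at /var/run/haproxy-l4.sock, that
it lists the L7 node as be_l7/node1, and that the L7 haproxy writes its pid
to /var/run/haproxy.pid (all of these names are illustrative):

   # 1) on the L4 LB: stop sending new connections to the L7 node
   echo "set weight be_l7/node1 0" | socat stdio /var/run/haproxy-l4.sock

   # 2) on the L7 node: send the soft-stop signal to haproxy
   pid=$(cat /var/run/haproxy.pid)
   kill -USR1 "$pid"

   # 3) wait for the process to be gone once the last connection is closed
   while kill -0 "$pid" 2>/dev/null; do sleep 1; done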

The case where users would actually like to close H2 connections is more
when they want some connections to re-establish on another node without
reloading the first one. Typically when moving a small portion of the
traffic on a node having a new config just to test it. In such cases it
would indeed be desirable to make it possible to close connections
earlier. But doing so is not without consequences. For example, if you
close immediately after finishing the last stream, you could make your
local TCP stack send RSTs due to the client sending WINDOW_UPDATEs to
acknowledge receipt of the data, because TCP doesn't know that
WINDOW_UPDATEs are just ACKs and do not convey useful data that could
be drained like empty ACKs. I do have some plans to work around this
protocol deficiency, already noted in an old issue (I think it's #5 but
not sure), which would consist in sending a frame that must be ACKed
after the GOAWAY (or in any case before closing), so that we know when
the peer has received
the last frame. This could be a PING or a SETTINGS frame (more interesting
as we could advertise max_stream=0 there). But it means adding a new state
to the existing state machine and validating that we don't break existing
implementations for example.

Nothing is impossible and I would really like to have a way to gracefully
close connections, at least because there are situations where it's
desirable. But it must not be seen as a one-size-fits-all solution. In
your case, if it's only for taking a node off an L4 LB farm, I can't
see any limitation of the existing solution.

Regards,
Willy



Re: maint, drain: the right approach

2023-05-23 Thread Willy Tarreau
On Tue, May 23, 2023 at 11:21:28AM +0200, Thomas Pedoussaut wrote:
> 
> On 23/05/2023 11:14, Matteo Piva wrote:
> > Seems that it's considered an expected behavior to consider optimistically
> > the server as UP when leaving MAINT mode, even if the L4 health checks
> > are not completed yet.

To be more precise, at boot time, servers start with one last point of
health. This makes sure they get a definitive verdict with the very first
health check. In parallel, maintenance stops health checks, so if you put
a server in maintenance before its first check, my take is that its check
is still in the same state so that when you leave maintenance, it is still
up with one last check to be performed. I could of course be proven wrong,
but that's what I have in mind with the operational status and the
administrative state which are two independent things.

> > I consider that a quite annoying feature, but maybe I'm approaching this
> > in the wrong way.
> > 
> > Waiting for others to comment on this issue to understand it better.
> I have the exact same issue. I'm inserting servers on the fly from info I
> get from my orchestration backend (AWS ECS), and they fail the first few
> requests before L7 checks flag them down.

I understand this, and others have the opposite case in fact, i.e. with
slow checks, they don't want to take ages to reinsert a previously disabled
server in the farm. Normally using the existing API you could forcefully
mark the server's check as down using this before leaving maintenance:

   set server <backend>/<server> health [ up | stopping | down ]

Doesn't it work to force it down before leaving maintenance and wait
for it to succeed its checks? That would give something like this to
leave maintenance:

   set server blah health down; set server blah state ready
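
For instance, this could be issued over the stats socket with socat (the
server name "blah" and the socket path are only examples):

   echo "set server blah health down; set server blah state ready" | \
       socat stdio /var/run/haproxy.sock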

By the way that reminds me that a long time ago we envisioned a server
option such as "init-state down" but with the ability to add servers on
the fly via the CLI it seemed a bit moot precisely because you should be
able to do the above. But again, do not hesitate to tell me if I'm wrong
somewhere, my goal is not to reject any change but to make sure we're not
trying to do something that's already possible (and possibly not obvious,
I concede).

Willy



Re: maint, drain: the right approach

2023-05-23 Thread Thomas Pedoussaut



On 23/05/2023 11:14, Matteo Piva wrote:
> Seems that it's considered an expected behavior to consider optimistically
> the server as UP when leaving MAINT mode, even if the L4 health checks are
> not completed yet.
> 
> I consider that a quite annoying feature, but maybe I'm approaching this
> in the wrong way.
> 
> Waiting for others to comment on this issue to understand it better.

I have the exact same issue. I'm inserting servers on the fly from info
I get from my orchestration backend (AWS ECS), and they fail the first
few requests before L7 checks flag them down.




Re: maint, drain: the right approach

2023-05-23 Thread Matteo Piva


> Hi Matteo, 

Hi Aurelien, thanks for your reply on my issue


> > Once the activity on the underlying service has been completed and they
> > are starting up, I switch back from MAINT to READY (without waiting for
> > the service to be really up).
> > The haproxy backend goes immediately back into the roundrobin pool, even
> > if the L4 and L7 checks still report that the underlying service is DOWN
> > (the service is still starting, which could take time).

> I would wait for others to confirm or infirm, but meanwhile the 
> situation you described makes me think of an issue that was revived on 
> Github a few weeks ago: https://github.com/haproxy/haproxy/issues/51 
> (server assumed UP when leaving MAINT before the checks are performed) 


yes, I saw that issue on Github, but I can also see lukastribus's comment
on that:

  When health checks are used, but did not start yet or the status is not
  yet determined, then the current server status will be UP. This is
  documented and expected behavior.


Seems that it's considered an expected behavior to consider optimistically
the server as UP when leaving MAINT mode, even if the L4 health checks are
not completed yet.

I consider that a quite annoying feature, but maybe I'm approaching this
in the wrong way.

Waiting for others to comment on this issue to understand it better.


Thanks, 
Matteo 



Re: maint, drain: the right approach

2023-05-23 Thread Aurelien DARRAGON
Hi Matteo,

> Once the activity on the underlying service has been completed and they
> are starting up, I switch back from MAINT to READY (without waiting for
> the service to be really up).
> The haproxy backend goes immediately back into the roundrobin pool, even
> if the L4 and L7 checks still report that the underlying service is DOWN
> (the service is still starting, which could take time).

I would wait for others to confirm or infirm, but meanwhile the
situation you described makes me think of an issue that was revived on
Github a few weeks ago: https://github.com/haproxy/haproxy/issues/51
(server assumed UP when leaving MAINT before the checks are performed)


Regards,
Aurelien



Re: maint, drain: the right approach

2023-05-23 Thread Matteo Piva
Hi all, 

still trying to figure out the right way to do this.

Any suggestions to share with me? 


Thanks, 
Matteo 

- Original Message -


From: "Matteo Piva" 
To: "HAProxy" 
Sent: Thursday, 11 May 2023 11:04:11
Subject: maint, drain: the right approach

Hi, 

I'm trying to work out the right maintenance procedure for when I have to
take down an HTTP backend for maintenance.

When I put one of the two backends in MAINT mode (disabled), the traffic is 
then immediately routed only to the active backend. And this includes 
persistent connections as well. 

Once the activity on the underlying service has been completed and they are
starting up, I switch back from MAINT to READY (without waiting for the
service to be really up).
The haproxy backend goes immediately back into the roundrobin pool, even if
the L4 and L7 checks still report that the underlying service is DOWN
(the service is still starting, which could take time).

So we have: 

- HAPROXY backend: ready->maint 
- Stopping service 
- Starting service 
- HAPROXY backend: maint->ready 
  (from now on the backend is back in the pool while haproxy is still
  checking whether the service is UP or DOWN at L4, so we get http/503s
  when calling the frontend)
- HAPROXY backend: down (checked, L4) 
- Service up 
- HAPROXY backend: up (checked, L4/L7) 

During the window between maint->ready and the first L4 check marking the
backend DOWN, clients get http/503 responses when they are routed to the
still-starting backend.



Now, I know that with DRAIN the L4/L7 checks keep running, and this could
solve the issue.
But it also means that I can't prevent persistent connections from being
routed to this backend, so I could still have http/503s during the
maintenance window.

Which is the right approach?
Is there a way to make the maint->ready transition pessimistic, i.e. wait
for the checks to complete before the backend goes back into the pool?
Or is there a way to make drain behave like maint, so that persistent
connections are also forcefully routed only to the active backends?


Thanks, 

Matteo