Re: 2.9-dev8: ha_panic from libcrypto/libssl (Ubuntu 22.04, OpenSSL 3.0.2)
On Thu, Nov 2, 2023 at 10:12 PM Tristan wrote:
> It's worth checking if your build has the commit that fixes this SSL
> crash that was on 2.9-dev8
> https://github.com/haproxy/haproxy/issues/2329 (the fix is on master
> already but it might not be in your build?)

My build is on the literal 2.9-dev8 tag. Will rebuild with the latest
master, to include commit e7bae7a.

Thank you!

Valters Jansons
Re: 2.9-dev8: ha_panic from libcrypto/libssl (Ubuntu 22.04, OpenSSL 3.0.2)
Hi,

On 02/11/2023 19:31, Valters Jansons wrote:
> On Tue, Oct 24, 2023 at 10:35 AM Willy Tarreau wrote:
> > > We are running 2.9-dev8 for the server connection close fix for
> > > "not-so-great" gRPC clients. We just experienced an ha_panic seemingly
> > > triggered from OpenSSL 3. This is a fairly default Ubuntu 22.04
> > > system, with locally built HAProxy package (as there are no "official"
> > > dev builds).

It's worth checking if your build has the commit that fixes this SSL
crash that was on 2.9-dev8
https://github.com/haproxy/haproxy/issues/2329 (the fix is on master
already but it might not be in your build?)

Regards,
Tristan
Re: 2.9-dev8: ha_panic from libcrypto/libssl (Ubuntu 22.04, OpenSSL 3.0.2)
On Tue, Oct 24, 2023 at 10:35 AM Willy Tarreau wrote:
> > We are running 2.9-dev8 for the server connection close fix for
> > "not-so-great" gRPC clients. We just experienced an ha_panic seemingly
> > triggered from OpenSSL 3. This is a fairly default Ubuntu 22.04
> > system, with locally built HAProxy package (as there are no "official"
> > dev builds).
>
> Hmm that's not cool. Did it happen only once or repeatedly ?

We are running multiple load balancers, and only one host experienced
this strange issue, once. I wanted to bring it up here on the list, as it
happened right after we deployed 2.9-dev8, but we have not seen the same
issue again since that one time. So it was most likely some strange
OpenSSL 3 edge case as you said.

> It's very possible, indeed. Do you have SSL on the frontend only or also
> on the backend ?

SSL is not used on the backend for this scenario.

> Also, is your machine heavily loaded or not ? I'm trying to estimate if
> it's worth switching to alternate locks in your case. In case you're
> interested in giving it a try, you can add USE_PTHREAD_EMULATION=1 to
> your "make" command line. It may seem to use more CPU but will in fact
> replace the sleeping wait by an active wait and for a shorter time,
> resulting in faster processing and a real (not just apparent) load
> reporting.

The system has some other services running on it (not only load
balancing), so it is not sitting idle and sees periodic load, but it's a
powerful box and the load it experiences shouldn't be a concern in my
eyes.

I went ahead and rebuilt with `USE_PTHREAD_EMULATION=1`. We run Ubuntu
22.04 across our fleet and plan to keep it that way, even with the
OpenSSL 3 drama. The rebuilt binary has been stable -- thank you for
bringing attention to the flag.

Overall, it feels strange, but there doesn't seem to be anything
actionable here in the end.

Thank you in any case!

Valters Jansons
Re: 2.9-dev8: ha_panic from libcrypto/libssl (Ubuntu 22.04, OpenSSL 3.0.2)
On Tue, Oct 24, 2023 at 02:03:03AM +0300, Valters Jansons wrote:
> Hello,

Hello,

> We are running 2.9-dev8 for the server connection close fix for
> "not-so-great" gRPC clients. We just experienced an ha_panic seemingly
> triggered from OpenSSL 3. This is a fairly default Ubuntu 22.04
> system, with locally built HAProxy package (as there are no "official"
> dev builds).

There is a list of packages available there:
https://github.com/haproxy/wiki/wiki/Packages

Specifically, I maintain a build for Ubuntu and Debian, based off the
latest commit of the master branch; the build is triggered for each push.

You can install them from here:
https://software.opensuse.org/download/package?package=haproxy&project=home%3Awlallemand

The package is based on the Debian one; here are the build options:
https://github.com/wlallemand/haproxy-nightly-build/blob/master/debian/rules#L10

--
William Lallemand
Re: 2.9-dev8: ha_panic from libcrypto/libssl (Ubuntu 22.04, OpenSSL 3.0.2)
Hello Valters,

On Tue, Oct 24, 2023 at 02:03:03AM +0300, Valters Jansons wrote:
> Hello,
>
> The trace log is uploaded at
> https://gist.github.com/sigv/58a5d148579c75d39b2b7c76a3254fa5
>
> We are running 2.9-dev8 for the server connection close fix for
> "not-so-great" gRPC clients. We just experienced an ha_panic seemingly
> triggered from OpenSSL 3. This is a fairly default Ubuntu 22.04
> system, with locally built HAProxy package (as there are no "official"
> dev builds).

Hmm that's not cool. Did it happen only once or repeatedly ? From what
I'm seeing, one of the SSL library calls froze the thread for 4 seconds
without making progress. Given openssl 3's extreme abuse of locking, it
sounds perfectly possible that under load one such thread never manages
to make progress and repeatedly fails.

> Our SSL/TLS configuration is fairly basic too. I do not think it
> contributes to the issue on hand. On bind we have `strict-sni`, a
> `crt-list` specified and `alpn h2,http/1.1`.
>
> ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets
> ssl-default-bind-ciphers
> ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
> ssl-default-bind-ciphersuites
> TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256

I'm not fluent in this but I'm not seeing any eccentricities there.

> In our log, we have some "SSL handshake failure" lines and some more
> detailed "SSL handshake failure (error:0A00010B:SSL routines::wrong
> version number)" lines. I presume these are not related -- instead
> being caused by some clients potentially connecting to port 443 and
> trying to talk plaintext, or wanting to run TLS 1.1 or older.

It's very possible, indeed. Do you have SSL on the frontend only or also
on the backend ? I'm asking because openssl3 is very bad on the frontend
but it's close to unusable at all on the backend. We've seen configs
saturate the CPU using only health checks!

Also, is your machine heavily loaded or not ? I'm trying to estimate if
it's worth switching to alternate locks in your case. In case you're
interested in giving it a try, you can add USE_PTHREAD_EMULATION=1 to
your "make" command line. It may seem to use more CPU but will in fact
replace the sleeping wait by an active wait and for a shorter time,
resulting in faster processing and a real (not just apparent) load
reporting.

Otherwise, if your SSL load is high, particularly on the backend, you
may need to switch back to a distro featuring openssl 1.1.1 (such as
ubuntu 20.04 for example).

Regards,
Willy
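
For readers following the thread, the TLS settings quoted in this message could be assembled into a configuration fragment roughly as follows. This is only a sketch, not the poster's actual configuration: the section names `fe_https` and `be_app`, the crt-list path, the server address and the `mode http` lines are hypothetical placeholders. Only the `ssl-default-bind-*` lines, the bind keywords, port 443 and the plaintext backend are taken from the thread.

```
global
    # Taken verbatim from the quoted message
    ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets
    ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
    ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256

# Hypothetical frontend name; the thread only mentions the bind keywords
frontend fe_https
    mode http    # assumed, not stated in the thread
    # The crt-list path is a placeholder
    bind :443 ssl crt-list /etc/haproxy/crt-list.txt strict-sni alpn h2,http/1.1
    default_backend be_app

# Hypothetical backend; no "ssl" keyword on the server line, since the
# thread says SSL is not used on the backend in this scenario
backend be_app
    mode http    # assumed, not stated in the thread
    server app1 192.0.2.10:8080
```

With `ssl-min-ver TLSv1.2` in place, clients offering only TLS 1.1 or older are rejected during the handshake, which would fit the presumption in the quoted message that the logged handshake failures are unrelated to the panic.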

