Re: OpenSSL 1.1.1 vs 3.0 client cert verify "x509_strict" issues
Hello,

On 12/12/2022 16:45, Froehlich, Dominik wrote:
> Hello HAproxy community!
>
> We’ve recently updated from OpenSSL 1.1.1 to OpenSSL 3.0 for our HAproxy
> deployment. We are now seeing some client certificates getting denied with
> these error messages:
>
> "SSL client CA chain cannot be verified" / "error:0A000086:SSL
> routines::certificate verify failed"
>
> We found out that for this CA certificate, the error was
> X509_V_ERR_MISSING_SUBJECT_KEY_IDENTIFIER. This error is only thrown if we
> run openssl verify with the "-x509_strict" option. The same call (even with
> the "-x509_strict" option) on OpenSSL 1.1.1 returned OK and verified.

Indeed, OpenSSL extended what the x509_strict option actually does in order to follow the requirements described in RFC 5280. OpenSSL's commit 0e071fbce4 gives a detailed list of the extra checks performed when x509_strict is set.

> As this was a bit surprising to us and we now have a customer who can’t use
> their client certificate anymore, we wanted to ask for some details on the
> OpenSSL verify check in HAproxy:
>
> * How does HAproxy call the "verify" command in OpenSSL?

Actual certificate and certificate chain verification is performed inside OpenSSL, so any default behavior change in OpenSSL itself might have an impact on which certificates we reject or not.

> * Does HAproxy use the "x509_strict" option programmatically?
>
> * Is there a flag in HAproxy that would allow us to temporarily disable the
> "strict" setting so that the customer has time to update their PKI?

I did not try to reproduce the problem you encountered yet, but you might have success with a proper crt-ignore-err and ca-ignore-err combination (on HAProxy's side). It does not disable strict checking per se, but it could allow you to accept certificates that were otherwise rejected.

> * If there is no flag, we could temporarily patch out the code that uses the
> flag, can you give us some pointers?
>
> Thanks a lot for your help!
> Dominik Froehlich, SAP

Hope this helps.

Rémi LB
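For reference, a minimal sketch of what the crt-ignore-err/ca-ignore-err suggestion could look like on a bind line (the frontend name and file paths are hypothetical; note that passing X509_V_ERR constant names requires a recent HAProxy, older versions expect the corresponding numeric error codes instead):

```
frontend fe_mtls
    # Hypothetical mTLS frontend; adjust crt/ca-file paths to your deployment.
    # ca-ignore-err / crt-ignore-err accept a comma-separated list of
    # verification errors (or "all") to tolerate during the handshake.
    bind :443 ssl crt /etc/haproxy/certs/site.pem ca-file /etc/haproxy/ca/ca.crt verify required ca-ignore-err X509_V_ERR_MISSING_SUBJECT_KEY_IDENTIFIER crt-ignore-err X509_V_ERR_MISSING_SUBJECT_KEY_IDENTIFIER
```

This only tolerates the listed errors for that bind line; all other verification failures still reject the handshake.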
OpenSSL 1.1.1 vs 3.0 client cert verify "x509_strict" issues
Hello HAproxy community!

We’ve recently updated from OpenSSL 1.1.1 to OpenSSL 3.0 for our HAproxy deployment. We are now seeing some client certificates getting denied with these error messages:

"SSL client CA chain cannot be verified" / "error:0A000086:SSL routines::certificate verify failed"

We found out that for this CA certificate, the error was X509_V_ERR_MISSING_SUBJECT_KEY_IDENTIFIER. This error is only thrown if we run openssl verify with the "-x509_strict" option. The same call (even with the "-x509_strict" option) on OpenSSL 1.1.1 returned OK and verified.

As this was a bit surprising to us and we now have a customer who can’t use their client certificate anymore, we wanted to ask for some details on the OpenSSL verify check in HAproxy:

* How does HAproxy call the "verify" command in OpenSSL?
* Does HAproxy use the "x509_strict" option programmatically?
* Is there a flag in HAproxy that would allow us to temporarily disable the "strict" setting so that the customer has time to update their PKI?
* If there is no flag, we could temporarily patch out the code that uses the flag, can you give us some pointers?

Thanks a lot for your help!
Dominik Froehlich, SAP
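The difference can be reproduced outside of HAProxy with the openssl CLI. A small sketch (the /tmp file names are placeholders; with OpenSSL 3.x, "req -x509" adds a Subject Key Identifier by default, so the throwaway demo CA below passes the strict check, while a CA certificate minted without an SKID fails it as described above):

```shell
# Generate a throwaway self-signed demo CA (OpenSSL 3.x adds the
# subjectKeyIdentifier and basicConstraints extensions by default).
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo-ca" -days 1 \
        -keyout /tmp/demo-ca.key -out /tmp/demo-ca.pem 2>/dev/null

# Check whether a CA certificate carries the SKID extension that
# -x509_strict now requires (no match means the extension is missing).
openssl x509 -in /tmp/demo-ca.pem -noout -text | grep -A1 "Subject Key Identifier"

# Run the strict RFC 5280 verification; a CA certificate without an SKID
# fails here with X509_V_ERR_MISSING_SUBJECT_KEY_IDENTIFIER.
openssl verify -x509_strict -CAfile /tmp/demo-ca.pem /tmp/demo-ca.pem
```

Substituting the customer's real CA file for /tmp/demo-ca.pem should show the missing-SKID failure directly.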
Re: Reproducible CI build with OpenSSL and "latest" keyword
On Mon, Dec 12, 2022 at 07:27:59PM +0500, Илья Шипицин wrote:
> I attached a patch.

Thanks!

> btw, we only build for the latest LibreSSL. are we ok to skip LibreSSL for
> stable branches ?

In <= 2.5 we are still building with 3.5.3:
http://git.haproxy.org/?p=haproxy-2.5.git;a=blob;f=.github/matrix.py;hb=HEAD#l132

Ideally it would be better to still build LibreSSL in stable. In my opinion there should be at least one fixed version plus the latest for this method to work, but if the latest is equal to an already-built version it doesn't make sense to build it again.

--
William Lallemand
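The rule William describes — always build the fixed versions, and add "latest" only when it resolves to something not already covered — could be sketched like this (function and variable names are hypothetical, not taken from matrix.py):

```python
def ssl_builds(fixed_versions, latest_resolved):
    """Return the list of SSL versions to build.

    The fixed versions are always built; the resolved "latest" version is
    appended only when it is not already covered by a fixed entry, so the
    same library version is never built twice.
    """
    if latest_resolved in fixed_versions:
        return list(fixed_versions)
    return list(fixed_versions) + [latest_resolved]

# "latest" resolves to a new release: build it on top of the fixed set.
print(ssl_builds(["3.5.3"], "3.6.0"))  # ['3.5.3', '3.6.0']
# "latest" equals an already-built version: no duplicate build.
print(ssl_builds(["3.5.3"], "3.5.3"))  # ['3.5.3']
```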
Re: Reproducible CI build with OpenSSL and "latest" keyword
I attached a patch.

btw, we only build for the latest LibreSSL. are we ok to skip LibreSSL for stable branches ?

the remaining feature requests might be addressed later, I hope

On Mon, Dec 12, 2022 at 13:03, William Lallemand wrote:
> On Mon, Dec 12, 2022 at 08:48:06AM +0100, William Lallemand wrote:
> > Hi Ilya !
> >
> > On Mon, Dec 12, 2022 at 10:56:11AM +0500, Илья Шипицин wrote:
> > > hello,
> > >
> > > I made a prototype of what I meant:
> > >
> > > https://github.com/chipitsine/haproxy/commit/c95955ecfd1a5b514c235b0f155bfa71178b51d5
> > >
> > - We don't often use "dev" in our branches so we should build everything
> > when it's not a stable branch.
> >
> > - We don't want to build "3.0" OR latest, in fact we only need to
> > condition the "latest" build, because the other one will always be
> > built.
> >
> > So once the "3.1" is released we could add an entry for it to
> > the file and "latest" will be another version. This way we could
> > backport the "3.1" in previous branches if we want to support it.
> >
> > > I'm not sure how stable branches are named in private github ci. If you can
> > > enlighten me, I'll try to adopt.
> > > currently, I did the following, if branch name is either master or contains
> > > "dev", so "latest" semantic is chosen, fixed versions are used otherwise.
> > >
> > The stable branches are named "haproxy-X.Y", so in my opinion we should
> > build the "latest" for anything which is not a stable branch.
> >
> > > also, I know that the same ci is used for
> > >
> > > https://github.com/haproxytech/quic-dev
> > >
> > > @Frederic Lecaille , which behaviour would you like
> > > for that repo ? what is branch naming convention ?
> >
> > The same as the master branch IMHO.
> >
> > Also, the problem is uglier than I thought, we are not testing 1.1.1
> > anymore since "ubuntu-latest" was upgraded to 22.04 a few weeks ago
> > without us noticing. "ssl=stock" is now a 3.0 branch.
> > It broke all stable branches below 2.6 because they need the
> > deprecated SSL API. I changed "ubuntu-latest" to "ubuntu-20.04" for
> > those branches so it works as earlier. I'm going to reintroduce
> > "1.1.1" for master to 2.6 so it is correctly tested again.
> >
> > In my opinion we need a similar mechanism for the distribution as for
> > the ssl libs. Maybe using "latest" only in dev branches and a fixed
> > version for stable branches will be enough.
> >
> > Regards,
>
> Just thought about something, is it possible to have the versions in the
> job names ? So we don't have surprises. For example the Ubuntu version
> which was resolved by "ubuntu-latest" and the SSL version of
> "ssl=stock", we could easily see the changes this way.
>
> --
> William Lallemand

From d3056da0e532914fca7ff0936be34d3df3e94602 Mon Sep 17 00:00:00 2001
From: Ilya Shipitsin
Date: Mon, 12 Dec 2022 19:15:22 +0500
Subject: [PATCH] CI: split ssl lib selection based on git branch

when *SSL_VERSION="latest" behaviour was introduced, it seems to be fine
for development branches, but too intrusive for stable branches.
let us limit "latest" semantic only for development builds, if branch
name contains "haproxy-" it is supposed to be stable branch, no latest
openssl should be taken
---
 .github/matrix.py           | 10 ++++------
 .github/workflows/vtest.yml |  2 +-
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/.github/matrix.py b/.github/matrix.py
index 98d0a1f2a..fd9491aee 100755
--- a/.github/matrix.py
+++ b/.github/matrix.py
@@ -15,12 +15,12 @@ import re
 from os import environ
 
 if len(sys.argv) == 2:
-    build_type = sys.argv[1]
+    ref_name = sys.argv[1]
 else:
-    print("Usage: {} <build_type>".format(sys.argv[0]), file=sys.stderr)
+    print("Usage: {} <ref_name>".format(sys.argv[0]), file=sys.stderr)
     sys.exit(1)
 
-print("Generating matrix for type '{}'.".format(build_type))
+print("Generating matrix for type '{}'.".format(ref_name))
 
 
 def clean_os(os):
@@ -129,11 +129,9 @@ for CC in ["gcc", "clang"]:
         "stock",
         "OPENSSL_VERSION=1.0.2u",
         "OPENSSL_VERSION=1.1.1s",
-        "OPENSSL_VERSION=latest",
-        "LIBRESSL_VERSION=latest",
         "QUICTLS=yes",
         #"BORINGSSL=yes",
-    ]:
+    ] + (["OPENSSL_VERSION=latest", "LIBRESSL_VERSION=latest"] if "haproxy-" not in ref_name else []):
         flags = ["USE_OPENSSL=1"]
         if ssl == "BORINGSSL=yes" or ssl == "QUICTLS=yes" or "LIBRESSL" in ssl:
             flags.append("USE_QUIC=1")
diff --git a/.github/workflows/vtest.yml b/.github/workflows/vtest.yml
index fb7b1d968..a7cdcc514 100644
--- a/.github/workflows/vtest.yml
+++ b/.github/workflows/vtest.yml
@@ -26,7 +26,7 @@ jobs:
       - uses: actions/checkout@v3
       - name: Generate Build Matrix
         id: set-matrix
-        run: python3 .github/matrix.py "${{ github.event_name }}"
+        run: python3 .github/matrix.py "${{ github.ref_name }}"
 
   # The Test job actually runs the tests.
   Test:
-- 
2.38.1
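Stripped of the diff context, the branch test the patch introduces behaves like this (the helper name is hypothetical; the version strings mirror the ones in matrix.py):

```python
def ssl_matrix(ref_name):
    # Fixed entries are always built, exactly as in .github/matrix.py.
    ssl = ["stock", "OPENSSL_VERSION=1.0.2u", "OPENSSL_VERSION=1.1.1s", "QUICTLS=yes"]
    # Stable branches are named "haproxy-X.Y"; only non-stable (dev)
    # branches additionally get the moving "latest" targets.
    if "haproxy-" not in ref_name:
        ssl += ["OPENSSL_VERSION=latest", "LIBRESSL_VERSION=latest"]
    return ssl

print(ssl_matrix("master"))       # fixed versions plus the two "latest" entries
print(ssl_matrix("haproxy-2.6"))  # fixed versions only
```

This is why the workflow now passes "${{ github.ref_name }}" instead of "${{ github.event_name }}": the matrix generator needs the branch name, not the event type, to make this decision.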
Re: Theoretical limits for a HAProxy instance
Hi,

On Mon, 2022-12-12 at 09:47 +0100, Iago Alonso wrote:
> Can you share haproxy -vv output ?
> HAProxy config:
> global
>     log /dev/log len 65535 local0 warning
>     chroot /var/lib/haproxy
>     stats socket /run/haproxy-admin.sock mode 660 level admin
>     user haproxy
>     group haproxy
>     daemon
>     maxconn 200
>     maxconnrate 2500
>     maxsslrate 2500

From your graphs (haproxy_process_current_ssl_rate / haproxy_process_current_connection_rate) you might be hitting the maxconnrate/maxsslrate limits.

-Jarno

--
Jarno Huuskonen
Vertical scaling of HAProxy instances
Hello,

We are trying to vertically scale our HAProxy instances, and we are not getting the results one would expect from upgrading the hardware (assuming the software can take advantage of the extra resources). We upgraded from machines with 16 threads to machines with 32 threads, and we are only observing a 50% increase in the ability to sustain connections and rps, as well as SSL rate, and we can’t seem to reach that rate before we overload the server.

I’ve recently posted about “Theoretical limits for a HAProxy instance”, where I used the "Small" server as an example of the limits we were observing. I am using the same metrics here. We performed the same test on a bigger server with production traffic, but raised the maxsslrate and maxconnrate from 2500 to 5000.

"Small" server specs:
CPU: AMD Ryzen 7 3700X 8-Core Processor (16 threads)
RAM: DDR4 64GB (2666 MT/s)

"Big" server specs:
CPU: AMD Ryzen 9 5950X 16-Core Processor (32 threads)
RAM: DDR4 128GB (2666 MT/s)

This is the post on discourse, where I posted some of our Prometheus metrics:
https://discourse.haproxy.org/t/vertical-scaling-of-haproxy-instances/8190

We are wondering:
- Are these results expected?
- Does anyone with a similar setup/config get different results?

Thanks in advance.
Theoretical limits for a HAProxy instance
Hello,

We are performing a lot of load tests, and we hit what we think is an artificial limit of some sort, or a parameter that we are not taking into account (HAProxy config setting, kernel parameter…). We are wondering if there’s a known limit on what HAProxy is able to process, or if someone has experienced something similar, as we are thinking about moving to bigger servers, and we don’t know if we will observe a big difference.

When trying to perform the load test in production, we observe that we can sustain 200k connections and 10k rps, with a load1 of about 10. The maxsslrate and maxsslconn are maxed out, but we handle the requests fine and we don’t return 5xx. Once we increase the load just a bit and hit 11k rps and about 205k connections, we start to return 5xx and we rapidly decrease the load, as these are tests against production.

Production server specs:
CPU: AMD Ryzen 7 3700X 8-Core Processor (16 threads)
RAM: DDR4 64GB (2666 MT/s)

When trying to perform a load test with synthetic tests using k6 as our load generator against staging, we are able to sustain 750k connections with 20k rps. The load generator has a ramp-up time of 120s to achieve the 750k connections, as that’s what we are trying to benchmark.

Staging server specs:
CPU: AMD Ryzen 5 3600 6-Core Processor (12 threads)
RAM: DDR4 64GB (3200 MT/s)

I've made a post about this on discourse, and I got the suggestion to post here. In said post, I've included screenshots of some of our Prometheus metrics.
https://discourse.haproxy.org/t/theoretical-limits-for-a-haproxy-instance/8168

Custom kernel parameters:

net.ipv4.ip_local_port_range = "12768 60999"
net.nf_conntrack_max = 500
fs.nr_open = 500

HAProxy config:

global
    log /dev/log len 65535 local0 warning
    chroot /var/lib/haproxy
    stats socket /run/haproxy-admin.sock mode 660 level admin
    user haproxy
    group haproxy
    daemon
    maxconn 200
    maxconnrate 2500
    maxsslrate 2500

defaults
    log global
    option dontlognull
    timeout connect 10s
    timeout client 120s
    timeout server 120s

frontend stats
    mode http
    bind *:8404
    http-request use-service prometheus-exporter if { path /metrics }
    stats enable
    stats uri /stats
    stats refresh 10s

frontend k8s-api
    bind *:6443
    mode tcp
    option tcplog
    timeout client 300s
    default_backend k8s-api

backend k8s-api
    mode tcp
    option tcp-check
    timeout server 300s
    balance leastconn
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 500 maxqueue 256 weight 100
    server master01 x.x.x.x:6443 check
    server master02 x.x.x.x:6443 check
    server master03 x.x.x.x:6443 check
    retries 0

frontend k8s-server
    bind *:80
    mode http
    http-request add-header X-Forwarded-Proto http
    http-request add-header X-Forwarded-Port 80
    default_backend k8s-server

backend k8s-server
    mode http
    balance leastconn
    option forwardfor
    default-server inter 10s downinter 5s rise 2 fall 2 check
    server worker01a x.x.x.x:31551 maxconn 20
    server worker02a x.x.x.x:31551 maxconn 20
    server worker03a x.x.x.x:31551 maxconn 20
    server worker04a x.x.x.x:31551 maxconn 20
    server worker05a x.x.x.x:31551 maxconn 20
    server worker06a x.x.x.x:31551 maxconn 20
    server worker07a x.x.x.x:31551 maxconn 20
    server worker08a x.x.x.x:31551 maxconn 20
    server worker09a x.x.x.x:31551 maxconn 20
    server worker10a x.x.x.x:31551 maxconn 20
    server worker11a x.x.x.x:31551 maxconn 20
    server worker12a x.x.x.x:31551 maxconn 20
    server worker13a x.x.x.x:31551 maxconn 20
    server worker14a x.x.x.x:31551 maxconn 20
    server worker15a x.x.x.x:31551 maxconn 20
    server worker16a x.x.x.x:31551 maxconn 20
    server worker17a x.x.x.x:31551 maxconn 20
    server worker18a x.x.x.x:31551 maxconn 20
    server worker19a x.x.x.x:31551 maxconn 20
    server worker20a x.x.x.x:31551 maxconn 20
    server worker01an x.x.x.x:31551 maxconn 20
    server worker02an x.x.x.x:31551 maxconn 20
    server worker03an x.x.x.x:31551 maxconn 20
    retries 0

frontend k8s-server-https
    bind *:443 ssl crt /etc/haproxy/certs/
    mode http
    http-request add-header X-Forwarded-Proto https
    http-request add-header X-Forwarded-Port 443
    http-request del-header X-SERVER-SNI
    http-request set-header X-SERVER-SNI %[ssl_fc_sni] if { ssl_fc_sni -m found }
    http-request set-var(txn.fc_sni) hdr(X-SERVER-SNI) if { hdr(X-SERVER-SNI) -m found }
    http-request del-header X-SERVER-SNI
    default_backend k8s-server-https

backend k8s-server-https
    mode http
    balance leastconn
    option forwardfor
    default-server inter 10s downinter 5s rise 2 fall 2 check no-check-ssl
    server worker01a x.x.x.x:31445 ssl ca-file /etc/haproxy/ca/ca.crt sni var(txn.fc_sni) maxconn 20
    server worker02a x.x.x.x:31445 ssl
Re: Reproducible CI build with OpenSSL and "latest" keyword
On Mon, Dec 12, 2022 at 08:48:06AM +0100, William Lallemand wrote:
> Hi Ilya !
>
> On Mon, Dec 12, 2022 at 10:56:11AM +0500, Илья Шипицин wrote:
> > hello,
> >
> > I made a prototype of what I meant:
> >
> > https://github.com/chipitsine/haproxy/commit/c95955ecfd1a5b514c235b0f155bfa71178b51d5
> >
> - We don't often use "dev" in our branches so we should build everything
> when it's not a stable branch.
>
> - We don't want to build "3.0" OR latest, in fact we only need to
> condition the "latest" build, because the other one will always be
> built.
>
> So once the "3.1" is released we could add an entry for it to
> the file and "latest" will be another version. This way we could
> backport the "3.1" in previous branches if we want to support it.
>
> > I'm not sure how stable branches are named in private github ci. If you can
> > enlighten me, I'll try to adopt.
> > currently, I did the following, if branch name is either master or contains
> > "dev", so "latest" semantic is chosen, fixed versions are used otherwise.
> >
> The stable branches are named "haproxy-X.Y", so in my opinion we should
> build the "latest" for anything which is not a stable branch.
>
> > also, I know that the same ci is used for
> >
> > https://github.com/haproxytech/quic-dev
> >
> > @Frederic Lecaille , which behaviour would you like
> > for that repo ? what is branch naming convention ?
>
> The same as the master branch IMHO.
>
> Also, the problem is uglier than I thought, we are not testing 1.1.1
> anymore since "ubuntu-latest" was upgraded to 22.04 a few weeks ago
> without us noticing. "ssl=stock" is now a 3.0 branch. It broke all
> stable branches below 2.6 because they need the deprecated SSL API.
> I changed "ubuntu-latest" to "ubuntu-20.04" for those branches so it
> works as earlier. I'm going to reintroduce "1.1.1" for master to 2.6 so
> it is correctly tested again.
>
> In my opinion we need a similar mechanism for the distribution as for
> the ssl libs. Maybe using "latest" only in dev branches and a fixed
> version for stable branches will be enough.
>
> Regards,

Just thought about something, is it possible to have the versions in the job names ? So we don't have surprises. For example the Ubuntu version which was resolved by "ubuntu-latest" and the SSL version of "ssl=stock", we could easily see the changes this way.

--
William Lallemand
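The idea of exposing the resolved versions in the job name could be sketched like this (purely illustrative; the function name and label format are assumptions, not the actual matrix.py output):

```python
def job_name(cc, os_version, ssl):
    # Embed the resolved OS and SSL versions in the CI job label, so a
    # silent change of "ubuntu-latest" or "ssl=stock" shows up directly
    # in the job list instead of going unnoticed.
    return "{}, {}, {}".format(os_version, cc, ssl)

print(job_name("gcc", "ubuntu-22.04", "OPENSSL_VERSION=3.0.7"))
# -> ubuntu-22.04, gcc, OPENSSL_VERSION=3.0.7
```

With names built this way, the upgrade from 20.04 to 22.04 (and the accompanying OpenSSL 1.1.1 to 3.0 switch) would have been visible as a renamed job.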