Hello,

Sorry for the lengthy post, but I wanted to give as much info up front as possible, since that takes a lot of the guesswork out of it!

I've recently started testing a combination of HAProxy 3.1 and Varnish 7.6 for content delivery / offloading, and I'm curious whether people have any data, suggestions, or optimizations that could push performance further.

I started with the HAProxy PPA for Ubuntu 24.04 (noble) ( https://launchpad.net/~vbernat/+archive/ubuntu/haproxy-3.1 ), thanks Vincent for providing these! The build from the PPA uses the OS-distributed OpenSSL, which is 3.0.13 on Ubuntu 24.04. I also have a custom build where I compiled in AWS-LC version 1.42.

PPA distribution:

    Build options :
      TARGET  = linux-glibc
      CC      = x86_64-linux-gnu-gcc
      CFLAGS  = -O2 -g -fwrapv -g -O2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -flto=auto -ffat-lto-objects -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -fdebug-prefix-map=/build/haproxy-ScKxv0/haproxy-3.1.1=/usr/src/haproxy-3.1.1-1ppa1~noble -Wdate-time -D_FORTIFY_SOURCE=3
      OPTIONS = USE_OPENSSL=1 USE_LUA=1 USE_SLZ=1 USE_OT=1 USE_QUIC=1 USE_PROMEX=1 USE_PCRE2=1 USE_PCRE2_JIT=1 USE_QUIC_OPENSSL_COMPAT=1
      DEBUG   =

    Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE -LIBATOMIC +LIBCRYPT +LINUX_CAP +LINUX_SPLICE +LINUX_TPROXY +LUA +MATH -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL -OPENSSL_AWSLC -OPENSSL_WOLFSSL +OT -PCRE +PCRE2 +PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL +PROMEX -PTHREAD_EMULATION +QUIC +QUIC_OPENSSL_COMPAT +RT +SHM_OPEN +SLZ +SSL -STATIC_PCRE -STATIC_PCRE2 +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL -ZLIB

My own build:

    Build options :
      TARGET  = linux-glibc
      CC      = cc
      CFLAGS  = -O2 -g -fwrapv
      OPTIONS = USE_OPENSSL_AWSLC=1 USE_SLZ=1 USE_QUIC=1 USE_PROMEX=1 USE_PCRE2=1 USE_PCRE2_JIT=1
      DEBUG   =

    Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE -LIBATOMIC +LIBCRYPT +LINUX_CAP +LINUX_SPLICE +LINUX_TPROXY -LUA -MATH -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL +OPENSSL_AWSLC -OPENSSL_WOLFSSL -OT -PCRE +PCRE2 +PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL +PROMEX -PTHREAD_EMULATION +QUIC -QUIC_OPENSSL_COMPAT +RT +SHM_OPEN +SLZ +SSL -STATIC_PCRE -STATIC_PCRE2 +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL -ZLIB

I know there are quite a few differences in the CFLAGS, but I went with it anyway!

Test system:
- E5-2698v4 (20 cores, 40 threads)
- 128GB of 2133MHz DDR4 RAM (16GB DIMMs, in the correct banks)
- Ubuntu 24.04 with generic kernel

HAProxy config:

    global
        log /dev/log local0
        log /dev/log local1 notice
        chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin
        stats timeout 30s
        user haproxy
        group haproxy
        maxconn 50000
        daemon

        # Default SSL material locations
        ca-base /etc/ssl/certs
        crt-base /etc/ssl/private

        # See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
        ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
        ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
        ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

    defaults
        log global
        mode http
        option httplog
        option dontlognull
        timeout connect 5000
        timeout client 50000
        timeout server 50000
        errorfile 400 /etc/haproxy/errors/400.http
        errorfile 403 /etc/haproxy/errors/403.http
        errorfile 408 /etc/haproxy/errors/408.http
        errorfile 500 /etc/haproxy/errors/500.http
        errorfile 502 /etc/haproxy/errors/502.http
        errorfile 503 /etc/haproxy/errors/503.http
        errorfile 504 /etc/haproxy/errors/504.http

    frontend ft_web
        bind *:80,:::80 v6only
        bind *:443,:::443 v6only ssl crt /etc/haproxy/ssl/
        bind quic4@:443 ssl crt /etc/haproxy/ssl/ alpn h3
        bind quic6@:443 ssl crt /etc/haproxy/ssl/ alpn h3
        http-after-response add-header alt-svc 'h3=":443"; ma=60'
        option forwardfor
        no log
        http-request set-header X-Forwarded-Proto https if { ssl_fc }
        http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
        default_backend bk_varnish

    backend bk_varnish
        mode http
        no log
        server varnish1 127.0.0.1:8443 send-proxy

The SSL certificate being used is a P-256 ECC certificate.

The test uses h2load with 14 threads, 100k requests, 100 clients, and 10 concurrent streams per client, requesting a 2 megabyte JPEG image.

It's worth noting that with raw Varnish (127.0.0.1:6081) I am able to push the system to 175Gbps of traffic served out of Varnish via h2c. This obviously brings the challenge of having to run h2load on the same system, since I sadly do not have two servers connected at 200G!

If I use haproxy to terminate SSL using OpenSSL 3.0, I am able to do 63.76Gbps of traffic over HTTP/2 (h2).
Doing the same test with the haproxy build that uses AWS-LC v1.42, I get 64.21Gbps of traffic over HTTP/2 (h2).

Repeating the tests and averaging them out, the two builds end up within margin of error of each other.

The question is whether this is to be expected, i.e. that they perform roughly the same with semi-large chunks of data (2MB), or whether there are some key CFLAGS options I am missing that could push it a bit further.

From the various posts on the interwebs, and the haproxy wiki on GitHub, it seems that people in general do not recommend OpenSSL 3.0 due to its bad performance, and that AWS-LC should perform quite a bit better. But is that largely for small files, or should it be an improvement overall?

Is there anything I could possibly do to push things further? The target file size is 2 megabytes, served straight out of memory from Varnish.

It's worth noting I did the same test on an EPYC 7502 and got only marginally better performance, at around 68Gbps of SSL traffic (180Gbps of h2c via Varnish directly). I know the 68Gbps figure also means haproxy is moving that same traffic over localhost on top, due to the backend bk_varnish.
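
To put the figures above side by side, here is a minimal Python sketch. The function and variable names are made up for illustration, and the inputs are only the single-run numbers quoted in this post, not the averaged results. It computes how far apart the two TLS builds are and what fraction of the direct h2c throughput survives once haproxy terminates TLS in front of Varnish.

```python
# Throughput figures quoted in this post, in Gbps.
# All names below are illustrative, not part of any tool.

def relative_difference(a: float, b: float) -> float:
    """Return |a - b| as a fraction of the smaller of the two values."""
    return abs(a - b) / min(a, b)


def retained_fraction(tls_gbps: float, cleartext_gbps: float) -> float:
    """Return the share of cleartext (h2c) throughput kept with TLS."""
    return tls_gbps / cleartext_gbps


# E5-2698v4 results
XEON_H2C_DIRECT = 175.0   # Varnish directly, h2c
XEON_OPENSSL_30 = 63.76   # haproxy + OpenSSL 3.0, h2
XEON_AWS_LC_142 = 64.21   # haproxy + AWS-LC v1.42, h2

# EPYC 7502 results (the post does not say which TLS library was used)
EPYC_H2C_DIRECT = 180.0
EPYC_TLS = 68.0

if __name__ == "__main__":
    diff = relative_difference(XEON_OPENSSL_30, XEON_AWS_LC_142)
    print(f"AWS-LC vs OpenSSL 3.0 on the Xeon: {diff:.2%} apart")

    for label, tls, clear in [
        ("Xeon, OpenSSL 3.0", XEON_OPENSSL_30, XEON_H2C_DIRECT),
        ("Xeon, AWS-LC 1.42", XEON_AWS_LC_142, XEON_H2C_DIRECT),
        ("EPYC 7502", EPYC_TLS, EPYC_H2C_DIRECT),
    ]:
        share = retained_fraction(tls, clear)
        print(f"{label}: {share:.1%} of direct h2c throughput")
```

Keep in mind the h2c number bypasses haproxy entirely, so the ratio mixes TLS cost with the cost of the extra proxy hop over loopback.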

Another thing I noticed while testing QUIC/HTTP3 is that haproxy seems to consume about 4x as many resources to serve the same amount of traffic (currently ~2Gbps of actual traffic) as it does for h2. Is that something that's likely to improve in the future?

Best Regards,
Lucas Rolff