Re: Followup on openssl 3.0 note seen in another thread
On 5/29/23 20:38, Willy Tarreau wrote:
> Have you verified that the CPU is saturated ?

The CPU on the machine running the test settles at about 1800 percent for my test program. 12 real cores, hyperthreaded. The CPU on the frontend haproxy process is barely breathing hard. Never saw it get above 150%. That server has 24 real cores. The CPU on the backend haproxy running on the raspberry pi hovers between 250 and 280%. It's a 3B, so it has four CPU cores.

Those CPU values were gathered with the test program running 24 threads against quictls 1.1.1t. With 200 threads, the CPU usage on all 3 systems is even lower. So I would say I am not saturating the CPU. I need a different test methodology ... this Java program is not really doing much to haproxy.

> Without keep-alive nor TLS resume, you should see roughly 1000 connections
> per second per core, and with TLS resume you should see roughly 4000 conns/s
> per core. So if you have 12 cores you should see 12000 or 48000 conns/s
> depending if you're using TLS resume or full rekey.

It's doing whatever Apache's httpclient does with Java's TLS. I know it's not doing keepalive; I explicitly pass the Connection: close header. I do not know if it uses TLS resume or not, and I do not know how to discover that info. I'm not seeing anywhere near that connection rate. Not even with an haproxy backend.

> Hmmm are you sure you didn't build the client with OpenSSL 3.0 ? I'm asking
> because that was our first concern when we tested the perf on Intel's SPR
> machine. No way to go beyond 400 conn/s, with haproxy totally idle and the
> client at 100% on 48 cores... The cause was OpenSSL 3. Rebuilding under
> 1.1.1 jumped to 74000, almost 200 times more!

The client is a Java program running in Java 11, with nothing to have it use anything but Java's TLS. It should not be using any version of openssl.

> > https://asciinema.elyograg.org/haproxyssltest1.html
>
> Hmmm host not found here.

Oops. I did not get that name in my public DNS. Fixed. The run it shows is from earlier, before I set up a backend running haproxy. That run is using 200 threads. When it ends, it reports the connection rate at 244.69 per second.

Thanks,
Shawn
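One way to answer the TLS-resume question from the client side, sketched here rather than taken from the thread: run the JVM with JSSE handshake debugging and look for session-resumption messages, or probe the frontend with openssl s_client, which labels each of its automatic reconnects as "New" or "Reused" (the hostname and the jar invocation below are placeholders):

    java -Djavax.net.debug=ssl:handshake -jar haproxytestssl.jar https://hostname/ 24
    openssl s_client -connect hostname:443 -reconnect < /dev/null 2>/dev/null | grep -E '^(New|Reused),'

If the s_client reconnects report "Reused", the listener supports resumption; whether the Java client actually takes advantage of it shows up in the javax.net.debug output.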
Re: Followup on openssl 3.0 note seen in another thread
On Sat, May 27, 2023 at 02:56:39PM -0600, Shawn Heisey wrote:
> On 5/27/23 02:59, Willy Tarreau wrote:
> > The little difference makes me think you've sent your requests over
> > a keep-alive connection, which is fine, but which doesn't stress the
> > TLS stack anymore.
>
> Yup. It was using keepalive. I turned keepalive off and repeated the
> tests.
>
> I'm still not seeing a notable difference between the branches, so I have to
> wonder whether I need a completely different test. Or whether I simply
> don't need to worry about it at all because my traffic needs are so small.

Have you verified that the CPU is saturated ?

> Requests per second is down around 60 instead of 1200, and the request time
> percentile values went up.

At such a low performance it's unlikely that you could hurt the CPU at all, I suspect the limiting factor is the load generator (or there's something else).

> I've included two runs per branch here. 24 threads, each doing 1000 requests.
> The haproxy logs indicate the page I'm hitting returns 829 bytes, while the
> actual index.html is 1187 bytes. I think gzip compression and the HTTP
> headers explain the difference. Without keepalive, the overall test takes a
> lot longer, which is not surprising.

Without keep-alive nor TLS resume, you should see roughly 1000 connections per second per core, and with TLS resume you should see roughly 4000 conns/s per core. So if you have 12 cores you should see 12000 or 48000 conns/s depending if you're using TLS resume or full rekey.

Hmmm are you sure you didn't build the client with OpenSSL 3.0 ? I'm asking because that was our first concern when we tested the perf on Intel's SPR machine. No way to go beyond 400 conn/s, with haproxy totally idle and the client at 100% on 48 cores... The cause was OpenSSL 3. Rebuilding under 1.1.1 jumped to 74000, almost 200 times more!

> The high percentiles are not encouraging. 7 seconds to get a web page under
> 1kb?, even with 1.1.1t?
>
> This might be interesting to someone:
>
> https://asciinema.elyograg.org/haproxyssltest1.html

Hmmm host not found here.

> I put the project in github.
>
> https://github.com/elyograg/haproxytestssl

I'm seeing everything being done in doGet() but I have no idea about the overhead of the allocations there nor the cost of the lower layers. Maybe there's even some DNS resolution involved, I don't know. That's exactly what I don't like with such languages, they come with tons of pre-defined functions to do whatever but you have no idea how they do them, so in the end you don't know what you're testing.

Please do me a favor and verify two things:
  - check the CPU usage using "top" on the haproxy machine during the test
  - check the CPU usage using "top" on the load generator machine during the test

Until you reach 100% on haproxy you're measuring something else.

Please do a comparative check using h1load from a machine having openssl 1.1.1 (e.g. ubuntu 20):

    git clone https://github.com/wtarreau/h1load/
    cd h1load
    make -j
    ./h1load -t $(nproc) -c 240 -r 1 --tls-reuse https://hostname/path

This will create 240 concurrent connections to the server, without keep-alive (-r 1 = 1 request per connection), with TLS session resume, and using as many threads as you have CPU cores. You'll see the number of connections per second in the cps column, and the number of requests per second in the rps column. On the left column you'll see the instant number of connections, and on the right you'll see the response time in milliseconds.
And please do check that this time the CPU is saturated either on haproxy or on the client. If you have some network latency between the two, you may need to increase the number of connections. You can drop "-r 1" if you want to test with keep-alive. Or you can drop --tls-reuse if you want to test the rekeying performance (for sites that take many new clients making few requests). You can also limit the total number of requests using "-n 24000" for example. Better make sure this number is an integral multiple of the number of connections, even though this is not mandatory at least it's cleaner. Similarly it's better if the number of connections (-c) is an integral multiple of the number of threads (-t) so that each thread is equally loaded. Willy
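Putting those sizing rules together, a run shaped like the 24000-request tests in this thread might look like the following; the hostname, path, and 12-thread count are illustrative:

    ./h1load -t 12 -c 240 -n 24000 -r 1 --tls-reuse https://hostname/path

Here -c 240 is an integral multiple of -t 12 and -n 24000 is an integral multiple of -c 240; drop "-r 1" to test keep-alive, or drop "--tls-reuse" to measure full rekeying instead.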
Re: Followup on openssl 3.0 note seen in another thread
On 5/29/23 01:43, Aleksandar Lazic wrote:
> HAProxies FE => HAProxies BE => Destination Servers
>
> Where the Destination Servers are also HAProxies which just returns a static
> content or any high performance low latency HTTPS Server.
>
> With such a Setup can you test also the Client mode of the OpenSSL.

Oops. Mistype sent that message before I could finish it.

Interesting idea. I set up haproxy on a raspberry pi and configured it to serve a static web page with https. Running the same version of haproxy on both the main server and the raspi, built with the same version of quictls.

https://raspi1.elyograg.org

Side note: compiling and installing quictls and haproxy is a lot slower on a raspberry pi than on a dell server. 84 seconds on the dell server and 2591 seconds on the pi. Make gets 12 threads on the server, 2 on the pi ... I give it half of the physical core count, rounded up to 2. It took a while to get this info due to the slow compile speeds on the pi. I wish build systems could give me an accurate estimate of how far along the build is. The quictls one doesn't say ANYTHING.

The requests are taking more time in general. This is due to another round trip (including TLS) from the server to the raspberry pi that did not occur before. With the other URL, it was forwarding to Apache on the same server, port 81, without TLS.

I still wouldn't call it a smoking gun, but this test shows evidence of 1.1 handling the concurrency better than 3.0.

1.1.1t:
20:31:21.177 [main] INFO o.e.t.h.MainSSLTest Count 24000 310.31/s
20:31:21.177 [main] INFO o.e.t.h.MainSSLTest 10th % 53 ms
20:31:21.178 [main] INFO o.e.t.h.MainSSLTest 25th % 60 ms
20:31:21.178 [main] INFO o.e.t.h.MainSSLTest Median 69 ms
20:31:21.178 [main] INFO o.e.t.h.MainSSLTest 75th % 81 ms
20:31:21.178 [main] INFO o.e.t.h.MainSSLTest 95th % 125 ms
20:31:21.178 [main] INFO o.e.t.h.MainSSLTest 99th % 163 ms
20:31:21.178 [main] INFO o.e.t.h.MainSSLTest 99.9 % 633 ms

3.0.8:
19:22:12.281 [main] INFO o.e.t.h.MainSSLTest Count 24000 290.48/s
19:22:12.281 [main] INFO o.e.t.h.MainSSLTest 10th % 59 ms
19:22:12.281 [main] INFO o.e.t.h.MainSSLTest 25th % 66 ms
19:22:12.282 [main] INFO o.e.t.h.MainSSLTest Median 75 ms
19:22:12.282 [main] INFO o.e.t.h.MainSSLTest 75th % 87 ms
19:22:12.282 [main] INFO o.e.t.h.MainSSLTest 95th % 123 ms
19:22:12.282 [main] INFO o.e.t.h.MainSSLTest 99th % 161 ms
19:22:12.282 [main] INFO o.e.t.h.MainSSLTest 99.9 % 1004 ms

3.1.0+locks:
The quictls compile failed on the pi, so I couldn't test this one. I suppose I could have done it without TLS, but I didn't do that. Here's the log from the compile:

/usr/bin/ld: unknown architecture of input file `libcrypto.a(libdefault-lib-pbkdf2_fips.o)' is incompatible with aarch64 output
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:22146: fuzz/cmp-test] Error 1
make[1]: *** Waiting for unfinished jobs
/usr/bin/ld: unknown architecture of input file `libcrypto.a(libdefault-lib-pbkdf2_fips.o)' is incompatible with aarch64 output
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:22270: fuzz/punycode-test] Error 1
make: *** [Makefile:3278: build_sw] Error 2

I wonder why that happened. 1.1.1t and 3.0.8 compiled just fine. All three work on x86_64.

I should set up my third server to serve the static page from haproxy. It's x86_64. Maybe when I find all that free time I am looking for!

Slightly interesting detail, not sure what it means: the backend for haproxy on the pi shows L6OK on the stats page instead of L7OK like all the other backends.

Thanks,
Shawn
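As an aside, the "half the physical core count, with a floor of 2" make parallelism described above could be computed along these lines; this is a sketch, not the actual build script used in the thread:

    # physical cores = unique (core,socket) pairs reported by lscpu
    phys=$(lscpu -p=Core,Socket | grep -v '^#' | sort -u | wc -l)
    jobs=$(( phys / 2 )); [ "$jobs" -lt 2 ] && jobs=2
    make -j "$jobs"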
Re: Followup on openssl 3.0 note seen in another thread
On 5/29/23 19:52, Shawn Heisey wrote: Interesting idea. So sorry. I was writing up the new reply, and my fingers got confused for a moment, accidentally did Ctrl-Enter which tells Thunderbird to send the message. Will send a new complete reply.
Re: Followup on openssl 3.0 note seen in another thread
On 5/29/23 01:43, Aleksandar Lazic wrote:
> HAProxies FE => HAProxies BE => Destination Servers
>
> Where the Destination Servers are also HAProxies which just returns a static
> content or any high performance low latency HTTPS Server.
>
> With such a Setup can you test also the Client mode of the OpenSSL.

Interesting idea. I set up haproxy on a raspberry pi and configured it to serve a static web page with https. Running the same version of haproxy on both the main server and the raspi, built with the same version of quictls.

https://raspi1.elyograg.org

Side note: compiling and installing quictls and haproxy is a lot slower on a raspberry pi than on a dell server. 84 seconds on the dell server and 2591 seconds on the pi. Make gets 12 threads on the server, 2 on the pi ... I give it half of the physical core count, rounded up to 2. It took a while to get this info due to the slow compile speeds on the pi. I wish build systems could give me an accurate estimate of how far along the build is. The quictls one doesn't say ANYTHING.

The requests are taking more time in general. This is due to another round trip (including TLS) from the server to the raspberry pi that did not occur before. With the other URL, it was forwarding to Apache on the same server, port 81, without TLS.

1.1.1t:

3.0.8:
19:22:12.281 [main] INFO o.e.t.h.MainSSLTest Count 24000 290.48/s
19:22:12.281 [main] INFO o.e.t.h.MainSSLTest 10th % 59 ms
19:22:12.281 [main] INFO o.e.t.h.MainSSLTest 25th % 66 ms
19:22:12.282 [main] INFO o.e.t.h.MainSSLTest Median 75 ms
19:22:12.282 [main] INFO o.e.t.h.MainSSLTest 75th % 87 ms
19:22:12.282 [main] INFO o.e.t.h.MainSSLTest 95th % 123 ms
19:22:12.282 [main] INFO o.e.t.h.MainSSLTest 99th % 161 ms
19:22:12.282 [main] INFO o.e.t.h.MainSSLTest 99.9 % 1004 ms

3.1.0+locks: Couldn't do this one. Compile fails:
Re: Followup on openssl 3.0 note seen in another thread
Hi Shawn. On 2023-05-28 (So.) 05:30, Shawn Heisey wrote: On 5/27/23 18:03, Shawn Heisey wrote: On 5/27/23 14:56, Shawn Heisey wrote: Yup. It was using keepalive. I turned keepalive off and repeated the tests. I did the tests again with 200 threads. The system running the tests has 12 hyperthreaded cores, so this definitely pushes its capabilities. I had forgotten a crucial fact that means all my prior testing work was invalid: Apache HttpClient 4.x defaults to a max simultaneous connection count of 2. Not going to exercise concurrency with that! I have increased that to 1024, my program's max thread count, and now the test is a LOT faster ... it's actually running 200 threads at the same time. Two runs per branch here, one with 200 threads and one with 24 threads. Still no smoking gun showing 3.0 as the slowest of the bunch. In fact, 3.0 is giving the best results! So my test method is still probably the wrong approach. Maybe you can change the setup in that way HAProxies FE => HAProxies BE => Destination Servers Where the Destination Servers are also HAProxies which just returns a static content or any high performance low latency HTTPS Server. With such a Setup can you test also the Client mode of the OpenSSL. Regards Alex 1.1.1t: 21:06:45.388 [main] INFO o.e.t.h.MainSSLTest Count 20 234.54/s 21:06:45.388 [main] INFO o.e.t.h.MainSSLTest 10th % 54 ms 21:06:45.388 [main] INFO o.e.t.h.MainSSLTest 25th % 94 ms 21:06:45.389 [main] INFO o.e.t.h.MainSSLTest Median 188 ms 21:06:45.389 [main] INFO o.e.t.h.MainSSLTest 75th % 991 ms 21:06:45.389 [main] INFO o.e.t.h.MainSSLTest 95th % 3698 ms 21:06:45.389 [main] INFO o.e.t.h.MainSSLTest 99th % 6924 ms 21:06:45.390 [main] INFO o.e.t.h.MainSSLTest 99.9 % 11983 ms - 21:20:35.400 [main] INFO o.e.t.h.MainSSLTest Count 24000 355.56/s 21:20:35.400 [main] INFO o.e.t.h.MainSSLTest 10th % 40 ms 21:20:35.401 [main] INFO o.e.t.h.MainSSLTest 25th % 46 ms 21:20:35.401 [main] INFO o.e.t.h.MainSSLTest Median 57 ms 21:20:35.401 [main] INFO o.e.t.h.MainSSLTest 75th % 71 ms 21:20:35.401 [main] INFO o.e.t.h.MainSSLTest 95th % 126 ms 21:20:35.401 [main] INFO o.e.t.h.MainSSLTest 99th % 168 ms 21:20:35.401 [main] INFO o.e.t.h.MainSSLTest 99.9 % 721 ms 3.0.8: 20:50:12.916 [main] INFO o.e.t.h.MainSSLTest Count 20 244.69/s 20:50:12.917 [main] INFO o.e.t.h.MainSSLTest 10th % 56 ms 20:50:12.917 [main] INFO o.e.t.h.MainSSLTest 25th % 93 ms 20:50:12.917 [main] INFO o.e.t.h.MainSSLTest Median 197 ms 20:50:12.917 [main] INFO o.e.t.h.MainSSLTest 75th % 949 ms 20:50:12.918 [main] INFO o.e.t.h.MainSSLTest 95th % 3425 ms 20:50:12.918 [main] INFO o.e.t.h.MainSSLTest 99th % 6679 ms 20:50:12.918 [main] INFO o.e.t.h.MainSSLTest 99.9 % 11582 ms - 21:23:22.076 [main] INFO o.e.t.h.MainSSLTest Count 24000 404.78/s 21:23:22.077 [main] INFO o.e.t.h.MainSSLTest 10th % 40 ms 21:23:22.077 [main] INFO o.e.t.h.MainSSLTest 25th % 45 ms 21:23:22.077 [main] INFO o.e.t.h.MainSSLTest Median 53 ms 21:23:22.077 [main] INFO o.e.t.h.MainSSLTest 75th % 63 ms 21:23:22.077 [main] INFO o.e.t.h.MainSSLTest 95th % 90 ms 21:23:22.077 [main] INFO o.e.t.h.MainSSLTest 99th % 121 ms 21:23:22.078 [main] INFO o.e.t.h.MainSSLTest 99.9 % 671 ms 3.1.0+locks: 20:33:32.805 [main] INFO o.e.t.h.MainSSLTest Count 20 238.02/s 20:33:32.806 [main] INFO o.e.t.h.MainSSLTest 10th % 58 ms 20:33:32.806 [main] INFO o.e.t.h.MainSSLTest 25th % 95 ms 20:33:32.806 [main] INFO o.e.t.h.MainSSLTest Median 196 ms 20:33:32.806 [main] INFO o.e.t.h.MainSSLTest 75th % 1001 ms 20:33:32.807 [main] INFO o.e.t.h.MainSSLTest 95th % 3475 ms 20:33:32.807 
[main] INFO o.e.t.h.MainSSLTest 99th % 6288 ms 20:33:32.807 [main] INFO o.e.t.h.MainSSLTest 99.9 % 10700 ms - 21:26:24.555 [main] INFO o.e.t.h.MainSSLTest Count 24000 402.89/s 21:26:24.556 [main] INFO o.e.t.h.MainSSLTest 10th % 39 ms 21:26:24.556 [main] INFO o.e.t.h.MainSSLTest 25th % 45 ms 21:26:24.556 [main] INFO o.e.t.h.MainSSLTest Median 52 ms 21:26:24.556 [main] INFO o.e.t.h.MainSSLTest 75th % 64 ms 21:26:24.556 [main] INFO o.e.t.h.MainSSLTest 95th % 93 ms 21:26:24.556 [main] INFO o.e.t.h.MainSSLTest 99th % 127 ms 21:26:24.557 [main] INFO o.e.t.h.MainSSLTest 99.9 % 689 ms
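A minimal haproxy "destination server" for the FE => BE => destination chain suggested above could be as small as the following sketch; the certificate path and the served file are placeholders, and Shawn's actual raspi configuration isn't shown in the thread:

    frontend static_https
        mode http
        bind :443 ssl crt /etc/haproxy/certs/site.pem alpn h2,http/1.1
        # serve a small static page directly from haproxy, no real web server behind it
        http-request return status 200 content-type text/html file /var/www/index.html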
Re: Followup on openssl 3.0 note seen in another thread
On 5/27/23 18:03, Shawn Heisey wrote:
> On 5/27/23 14:56, Shawn Heisey wrote:
> > Yup. It was using keepalive. I turned keepalive off and repeated the tests.
>
> I did the tests again with 200 threads. The system running the tests has 12
> hyperthreaded cores, so this definitely pushes its capabilities.

I had forgotten a crucial fact that means all my prior testing work was invalid: Apache HttpClient 4.x defaults to a max simultaneous connection count of 2. Not going to exercise concurrency with that! I have increased that to 1024, my program's max thread count, and now the test is a LOT faster ... it's actually running 200 threads at the same time.

Two runs per branch here, one with 200 threads and one with 24 threads. Still no smoking gun showing 3.0 as the slowest of the bunch. In fact, 3.0 is giving the best results! So my test method is still probably the wrong approach.

1.1.1t:
21:06:45.388 [main] INFO o.e.t.h.MainSSLTest Count 20 234.54/s
21:06:45.388 [main] INFO o.e.t.h.MainSSLTest 10th % 54 ms
21:06:45.388 [main] INFO o.e.t.h.MainSSLTest 25th % 94 ms
21:06:45.389 [main] INFO o.e.t.h.MainSSLTest Median 188 ms
21:06:45.389 [main] INFO o.e.t.h.MainSSLTest 75th % 991 ms
21:06:45.389 [main] INFO o.e.t.h.MainSSLTest 95th % 3698 ms
21:06:45.389 [main] INFO o.e.t.h.MainSSLTest 99th % 6924 ms
21:06:45.390 [main] INFO o.e.t.h.MainSSLTest 99.9 % 11983 ms
-
21:20:35.400 [main] INFO o.e.t.h.MainSSLTest Count 24000 355.56/s
21:20:35.400 [main] INFO o.e.t.h.MainSSLTest 10th % 40 ms
21:20:35.401 [main] INFO o.e.t.h.MainSSLTest 25th % 46 ms
21:20:35.401 [main] INFO o.e.t.h.MainSSLTest Median 57 ms
21:20:35.401 [main] INFO o.e.t.h.MainSSLTest 75th % 71 ms
21:20:35.401 [main] INFO o.e.t.h.MainSSLTest 95th % 126 ms
21:20:35.401 [main] INFO o.e.t.h.MainSSLTest 99th % 168 ms
21:20:35.401 [main] INFO o.e.t.h.MainSSLTest 99.9 % 721 ms

3.0.8:
20:50:12.916 [main] INFO o.e.t.h.MainSSLTest Count 20 244.69/s
20:50:12.917 [main] INFO o.e.t.h.MainSSLTest 10th % 56 ms
20:50:12.917 [main] INFO o.e.t.h.MainSSLTest 25th % 93 ms
20:50:12.917 [main] INFO o.e.t.h.MainSSLTest Median 197 ms
20:50:12.917 [main] INFO o.e.t.h.MainSSLTest 75th % 949 ms
20:50:12.918 [main] INFO o.e.t.h.MainSSLTest 95th % 3425 ms
20:50:12.918 [main] INFO o.e.t.h.MainSSLTest 99th % 6679 ms
20:50:12.918 [main] INFO o.e.t.h.MainSSLTest 99.9 % 11582 ms
-
21:23:22.076 [main] INFO o.e.t.h.MainSSLTest Count 24000 404.78/s
21:23:22.077 [main] INFO o.e.t.h.MainSSLTest 10th % 40 ms
21:23:22.077 [main] INFO o.e.t.h.MainSSLTest 25th % 45 ms
21:23:22.077 [main] INFO o.e.t.h.MainSSLTest Median 53 ms
21:23:22.077 [main] INFO o.e.t.h.MainSSLTest 75th % 63 ms
21:23:22.077 [main] INFO o.e.t.h.MainSSLTest 95th % 90 ms
21:23:22.077 [main] INFO o.e.t.h.MainSSLTest 99th % 121 ms
21:23:22.078 [main] INFO o.e.t.h.MainSSLTest 99.9 % 671 ms

3.1.0+locks:
20:33:32.805 [main] INFO o.e.t.h.MainSSLTest Count 20 238.02/s
20:33:32.806 [main] INFO o.e.t.h.MainSSLTest 10th % 58 ms
20:33:32.806 [main] INFO o.e.t.h.MainSSLTest 25th % 95 ms
20:33:32.806 [main] INFO o.e.t.h.MainSSLTest Median 196 ms
20:33:32.806 [main] INFO o.e.t.h.MainSSLTest 75th % 1001 ms
20:33:32.807 [main] INFO o.e.t.h.MainSSLTest 95th % 3475 ms
20:33:32.807 [main] INFO o.e.t.h.MainSSLTest 99th % 6288 ms
20:33:32.807 [main] INFO o.e.t.h.MainSSLTest 99.9 % 10700 ms
-
21:26:24.555 [main] INFO o.e.t.h.MainSSLTest Count 24000 402.89/s
21:26:24.556 [main] INFO o.e.t.h.MainSSLTest 10th % 39 ms
21:26:24.556 [main] INFO o.e.t.h.MainSSLTest 25th % 45 ms
21:26:24.556 [main] INFO o.e.t.h.MainSSLTest Median 52 ms
21:26:24.556 [main] INFO o.e.t.h.MainSSLTest 75th % 64 ms
21:26:24.556 [main] INFO o.e.t.h.MainSSLTest 95th % 93 ms
21:26:24.556 [main] INFO o.e.t.h.MainSSLTest 99th % 127 ms
21:26:24.557 [main] INFO o.e.t.h.MainSSLTest 99.9 % 689 ms
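For reference, raising the HttpClient 4.x limits described above looks roughly like this; 1024 mirrors the value Shawn mentions, and the rest is a sketch rather than the code from the repo:

    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

    public final class ClientFactory {
        // Pool sized so 200+ worker threads are not serialized behind the per-route default of 2.
        static CloseableHttpClient bigPoolClient() {
            PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
            cm.setMaxTotal(1024);            // overall pool limit
            cm.setDefaultMaxPerRoute(1024);  // per-host limit; the default of 2 is what capped concurrency
            return HttpClients.custom().setConnectionManager(cm).build();
        }
    }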
Re: Followup on openssl 3.0 note seen in another thread
On 5/27/23 14:56, Shawn Heisey wrote:
> Yup. It was using keepalive. I turned keepalive off and repeated the tests.

I did the tests again with 200 threads. The system running the tests has 12 hyperthreaded cores, so this definitely pushes its capabilities. The system running haproxy has 24 hyperthreaded cores. There is no thread or process info in haproxy.cfg.

200 threads takes so long to run that I didn't do multiple runs per branch. Any inconsistencies created by the fact that haproxy has just been restarted will hopefully be leveled out by how long the run takes.

The request times for 200 threads vs. 24 threads show that the speed went down. I think I have definitely saturated the test system, and hopefully also the haproxy server. Still no smoking gun showing the lock problems in 3.0. I had hoped that would be apparent.

1.1.1t:
15:52:18.666 [main] INFO o.e.t.h.MainSSLTest Count 20 56.82/s
15:52:18.668 [main] INFO o.e.t.h.MainSSLTest 10th % 31 ms
15:52:18.668 [main] INFO o.e.t.h.MainSSLTest 25th % 47 ms
15:52:18.668 [main] INFO o.e.t.h.MainSSLTest Median 994 ms
15:52:18.669 [main] INFO o.e.t.h.MainSSLTest 75th % 4953 ms
15:52:18.669 [main] INFO o.e.t.h.MainSSLTest 95th % 14205 ms
15:52:18.669 [main] INFO o.e.t.h.MainSSLTest 99th % 23581 ms
15:52:18.669 [main] INFO o.e.t.h.MainSSLTest 99.9 % 37396 ms

3.0.8:
16:59:03.645 [main] INFO o.e.t.h.MainSSLTest Count 20 58.34/s
16:59:03.647 [main] INFO o.e.t.h.MainSSLTest 10th % 30 ms
16:59:03.648 [main] INFO o.e.t.h.MainSSLTest 25th % 35 ms
16:59:03.648 [main] INFO o.e.t.h.MainSSLTest Median 368 ms
16:59:03.648 [main] INFO o.e.t.h.MainSSLTest 75th % 4606 ms
16:59:03.648 [main] INFO o.e.t.h.MainSSLTest 95th % 14840 ms
16:59:03.649 [main] INFO o.e.t.h.MainSSLTest 99th % 25561 ms
16:59:03.649 [main] INFO o.e.t.h.MainSSLTest 99.9 % 40826 ms

3.1.0+locks:
18:01:04.198 [main] INFO o.e.t.h.MainSSLTest Count 20 56.69/s
18:01:04.198 [main] INFO o.e.t.h.MainSSLTest 10th % 31 ms
18:01:04.198 [main] INFO o.e.t.h.MainSSLTest 25th % 39 ms
18:01:04.199 [main] INFO o.e.t.h.MainSSLTest Median 455 ms
18:01:04.199 [main] INFO o.e.t.h.MainSSLTest 75th % 4759 ms
18:01:04.199 [main] INFO o.e.t.h.MainSSLTest 95th % 15071 ms
18:01:04.199 [main] INFO o.e.t.h.MainSSLTest 99th % 25729 ms
18:01:04.200 [main] INFO o.e.t.h.MainSSLTest 99.9 % 41308 ms
Re: Followup on openssl 3.0 note seen in another thread
On 5/27/23 02:59, Willy Tarreau wrote: The little difference makes me think you've sent your requests over a keep-alive connection, which is fine, but which doesn't stress the TLS stack anymore. Yup. It was using keepalive. I turned keepalive off and repeated the tests. I'm still not seeing a notable difference between the branches, so I have to wonder whether I need a completely different test. Or whether I simply don't need to worry about it at all because my traffic needs are so small. Requests per second is down around 60 instead of 1200, and the request time percentile values went up. I've included two runs per branch here. 24 threads, each doing 1000 requests. The haproxy logs indicate the page I'm hitting returns 829 bytes, while the actual index.html is 1187 bytes. I think gzip compression and the HTTP headers explains the difference. Without keepalive, the overall test takes a lot longer, which is not surprising. The high percentiles are not encouraging. 7 seconds to get a web page under 1kb?, even with 1.1.1t? This might be interesting to someone: https://asciinema.elyograg.org/haproxyssltest1.html I put the project in github. https://github.com/elyograg/haproxytestssl quictls branch: OpenSSL_1_1_1t+quic 14:15:57.496 [main] INFO o.e.t.h.MainSSLTest Count 24000 64.65/s 14:15:57.498 [main] INFO o.e.t.h.MainSSLTest 10th % 28 ms 14:15:57.499 [main] INFO o.e.t.h.MainSSLTest 25th % 28 ms 14:15:57.499 [main] INFO o.e.t.h.MainSSLTest Median 31 ms 14:15:57.499 [main] INFO o.e.t.h.MainSSLTest 75th % 65 ms 14:15:57.500 [main] INFO o.e.t.h.MainSSLTest 95th % 2690 ms 14:15:57.500 [main] INFO o.e.t.h.MainSSLTest 99th % 5058 ms 14:15:57.500 [main] INFO o.e.t.h.MainSSLTest 99.9 % 9342 ms - 14:22:19.922 [main] INFO o.e.t.h.MainSSLTest Count 24000 65.39/s 14:22:19.924 [main] INFO o.e.t.h.MainSSLTest 10th % 28 ms 14:22:19.924 [main] INFO o.e.t.h.MainSSLTest 25th % 28 ms 14:22:19.924 [main] INFO o.e.t.h.MainSSLTest Median 31 ms 14:22:19.925 [main] INFO o.e.t.h.MainSSLTest 75th % 62 ms 14:22:19.925 [main] INFO o.e.t.h.MainSSLTest 95th % 2683 ms 14:22:19.925 [main] INFO o.e.t.h.MainSSLTest 99th % 4978 ms 14:22:19.925 [main] INFO o.e.t.h.MainSSLTest 99.9 % 7291 ms quictls branch: openssl-3.1.0+quic+locks 13:15:28.901 [main] INFO o.e.t.h.MainSSLTest Count 24000 63.43/s 13:15:28.903 [main] INFO o.e.t.h.MainSSLTest 10th % 29 ms 13:15:28.903 [main] INFO o.e.t.h.MainSSLTest 25th % 29 ms 13:15:28.903 [main] INFO o.e.t.h.MainSSLTest Median 32 ms 13:15:28.904 [main] INFO o.e.t.h.MainSSLTest 75th % 66 ms 13:15:28.904 [main] INFO o.e.t.h.MainSSLTest 95th % 2660 ms 13:15:28.904 [main] INFO o.e.t.h.MainSSLTest 99th % 4879 ms 13:15:28.905 [main] INFO o.e.t.h.MainSSLTest 99.9 % 9241 ms - 13:23:15.119 [main] INFO o.e.t.h.MainSSLTest Count 24000 62.99/s 13:23:15.121 [main] INFO o.e.t.h.MainSSLTest 10th % 29 ms 13:23:15.122 [main] INFO o.e.t.h.MainSSLTest 25th % 29 ms 13:23:15.122 [main] INFO o.e.t.h.MainSSLTest Median 32 ms 13:23:15.122 [main] INFO o.e.t.h.MainSSLTest 75th % 61 ms 13:23:15.123 [main] INFO o.e.t.h.MainSSLTest 95th % 2275 ms 13:23:15.123 [main] INFO o.e.t.h.MainSSLTest 99th % 6189 ms 13:23:15.123 [main] INFO o.e.t.h.MainSSLTest 99.9 % 11406 ms quictls branch: openssl-3.0.8+quic 13:34:25.780 [main] INFO o.e.t.h.MainSSLTest Count 24000 64.57/s 13:34:25.783 [main] INFO o.e.t.h.MainSSLTest 10th % 28 ms 13:34:25.783 [main] INFO o.e.t.h.MainSSLTest 25th % 28 ms 13:34:25.783 [main] INFO o.e.t.h.MainSSLTest Median 33 ms 13:34:25.783 [main] INFO o.e.t.h.MainSSLTest 75th % 66 ms 13:34:25.784 [main] INFO 
o.e.t.h.MainSSLTest 95th % 2642 ms 13:34:25.784 [main] INFO o.e.t.h.MainSSLTest 99th % 4994 ms 13:34:25.784 [main] INFO o.e.t.h.MainSSLTest 99.9 % 7503 ms - 14:08:33.750 [main] INFO o.e.t.h.MainSSLTest Count 24000 63.06/s 14:08:33.753 [main] INFO o.e.t.h.MainSSLTest 10th % 28 ms 14:08:33.753 [main] INFO o.e.t.h.MainSSLTest 25th % 29 ms 14:08:33.754 [main] INFO o.e.t.h.MainSSLTest Median 33 ms 14:08:33.754 [main] INFO o.e.t.h.MainSSLTest 75th % 64 ms 14:08:33.754 [main] INFO o.e.t.h.MainSSLTest 95th % 2904 ms 14:08:33.754 [main] INFO o.e.t.h.MainSSLTest 99th % 5216 ms 14:08:33.755 [main] INFO o.e.t.h.MainSSLTest 99.9 % 8287 ms
Re: Followup on openssl 3.0 note seen in another thread
Hi Shawn, On Fri, May 26, 2023 at 11:17:15PM -0600, Shawn Heisey wrote: > On 5/25/23 09:08, Willy Tarreau wrote: > > The problem definitely is concurrency, so 1000 curl will show nothing > > and will not even match production traffic. You'll need to use a load > > generator that allows you to tweak the TLS resume support, like we do > > with h1load's argument "--tls-reuse". Also I don't know how often the > > recently modified locks are used per server connection and per client > > connection, that's what the SSL guys want to know since they're not able > > to test their changes. > > I finally got a test program together. After trying and failing with the > Jetty HttpClient and Apache HttpClient version 5 (both options that would > have let me do HTTP/2) I got a program together with Apache HttpClient > version 4. I had one version that shelled out to curl, but it ran about ten > times slower. > > I know lots of people are going to have bad things to say about writing a > test in Java. It's the only language where I already know how to write > multi-threaded code. :-) > I would have to spend a bunch of time learning how to > do that in another language. For h2 there's h2load that is available but it doesn't allow you to close and re-open connections. > It fires up X threads, each of which make 1000 consecutive requests to the > URL specified. It records the time in milliseconds for each request, and > when all the threads finish, prints out statistics. These runs are with 24 > threads. I ran it on a different system so that it would not affect CPU > usage on the server running haproxy. Here's the results: > > quictls branch: OpenSSL_1_1_1t+quic > 23:01:19.067 [main] INFO o.e.t.h.MainSSLTest Count 24000 1228.69/s > 23:01:19.069 [main] INFO o.e.t.h.MainSSLTest Median 7562839 ns > 23:01:19.069 [main] INFO o.e.t.h.MainSSLTest 75th % 25138492 ns > 23:01:19.070 [main] INFO o.e.t.h.MainSSLTest 95th % 70603313 ns > 23:01:19.070 [main] INFO o.e.t.h.MainSSLTest 99th % 120502022 ns > 23:01:19.070 [main] INFO o.e.t.h.MainSSLTest 99.9 % 355829439 ns > > quictls branch: openssl-3.1.0+quic+locks > 22:56:11.457 [main] INFO o.e.t.h.MainSSLTest Count 24000 1267.96/s > 22:56:11.459 [main] INFO o.e.t.h.MainSSLTest Median 6827111 ns > 22:56:11.459 [main] INFO o.e.t.h.MainSSLTest 75th % 23239248 ns > 22:56:11.460 [main] INFO o.e.t.h.MainSSLTest 95th % 70625628 ns > 22:56:11.460 [main] INFO o.e.t.h.MainSSLTest 99th % 129494323 ns > 22:56:11.460 [main] INFO o.e.t.h.MainSSLTest 99.9 % 307070582 ns > > quictls branch: openssl-3.0.8+quic > 22:59:12.614 [main] INFO o.e.t.h.MainSSLTest Count 24000 1163.24/s > 22:59:12.616 [main] INFO o.e.t.h.MainSSLTest Median 6930268 ns > 22:59:12.616 [main] INFO o.e.t.h.MainSSLTest 75th % 26238752 ns > 22:59:12.616 [main] INFO o.e.t.h.MainSSLTest 95th % 75464869 ns > 22:59:12.616 [main] INFO o.e.t.h.MainSSLTest 99th % 132522508 ns > 22:59:12.617 [main] INFO o.e.t.h.MainSSLTest 99.9 % 445411125 ns > > The stats don't show any kind of smoking gun like I had hoped they would. > Not a lot of difference there. > > Differences in the requests per second are also not huge, but more in line > with what I was expecting. If I can believe those numbers, and I admit that > this kind of micro-benchmark is not the most reliable way to test > performance, it looks like 3.1.0 with the lock fixes is slightly faster than > 1.1.1t. 24 threads might not be enough to really exercise the concurrency > though. 
The little difference makes me think you've sent your requests over a keep-alive connection, which is fine, but which doesn't stress the TLS stack anymore. Those suffering from TLS performance problems are those with many connections, where the sole fact of resuming a TLS session (and even more creating a new one) takes a lot of time. But if your requests all pass over established connections, the TLS stack does nothing anymore; that's just trivial AES crypto that comes for free nowadays.

I have updated the ticket there with my measurements. With 24 cores I didn't measure a big difference in new-session rate since the CPU was dominated by asymmetric crypto (27.4k for 3.1 vs 30.5k for 1.1.1 and 35k for wolfSSL). However, with resumed connections the difference was more visible: 48.5k for 3.1, 49.9k for 3.1+locks, 106k for 1.1.1 and 124k for wolfSSL. And there, there's not that much contention (around 15% CPU lost waiting for a lock), which tends to indicate that it's mainly the excess usage of locks (even uncontended) or atomic ops that divides the performance by 2-2.5. For some users it means that if they currently need 4 LBs to stay under 80% load with 1.1.1, they will need 8-9 with 3.1 under the same conditions.

Another point that I didn't measure there (because it's always a pain to do) is the client mode, which is much more affected. It's less dramatic in 3.1 than in 3.0 but still very impacted. This will affect re-encrypted communications between haproxy and the origin servers.
Re: Followup on openssl 3.0 note seen in another thread
On 5/25/23 09:08, Willy Tarreau wrote:
> The problem definitely is concurrency, so 1000 curl will show nothing
> and will not even match production traffic. You'll need to use a load
> generator that allows you to tweak the TLS resume support, like we do
> with h1load's argument "--tls-reuse". Also I don't know how often the
> recently modified locks are used per server connection and per client
> connection, that's what the SSL guys want to know since they're not able
> to test their changes.

I finally got a test program together. After trying and failing with the Jetty HttpClient and Apache HttpClient version 5 (both options that would have let me do HTTP/2) I got a program together with Apache HttpClient version 4. I had one version that shelled out to curl, but it ran about ten times slower.

I know lots of people are going to have bad things to say about writing a test in Java. It's the only language where I already know how to write multi-threaded code. I would have to spend a bunch of time learning how to do that in another language.

It fires up X threads, each of which makes 1000 consecutive requests to the URL specified. It records the time in milliseconds for each request, and when all the threads finish, prints out statistics. These runs are with 24 threads. I ran it on a different system so that it would not affect CPU usage on the server running haproxy. Here's the results:

quictls branch: OpenSSL_1_1_1t+quic
23:01:19.067 [main] INFO o.e.t.h.MainSSLTest Count 24000 1228.69/s
23:01:19.069 [main] INFO o.e.t.h.MainSSLTest Median 7562839 ns
23:01:19.069 [main] INFO o.e.t.h.MainSSLTest 75th % 25138492 ns
23:01:19.070 [main] INFO o.e.t.h.MainSSLTest 95th % 70603313 ns
23:01:19.070 [main] INFO o.e.t.h.MainSSLTest 99th % 120502022 ns
23:01:19.070 [main] INFO o.e.t.h.MainSSLTest 99.9 % 355829439 ns

quictls branch: openssl-3.1.0+quic+locks
22:56:11.457 [main] INFO o.e.t.h.MainSSLTest Count 24000 1267.96/s
22:56:11.459 [main] INFO o.e.t.h.MainSSLTest Median 6827111 ns
22:56:11.459 [main] INFO o.e.t.h.MainSSLTest 75th % 23239248 ns
22:56:11.460 [main] INFO o.e.t.h.MainSSLTest 95th % 70625628 ns
22:56:11.460 [main] INFO o.e.t.h.MainSSLTest 99th % 129494323 ns
22:56:11.460 [main] INFO o.e.t.h.MainSSLTest 99.9 % 307070582 ns

quictls branch: openssl-3.0.8+quic
22:59:12.614 [main] INFO o.e.t.h.MainSSLTest Count 24000 1163.24/s
22:59:12.616 [main] INFO o.e.t.h.MainSSLTest Median 6930268 ns
22:59:12.616 [main] INFO o.e.t.h.MainSSLTest 75th % 26238752 ns
22:59:12.616 [main] INFO o.e.t.h.MainSSLTest 95th % 75464869 ns
22:59:12.616 [main] INFO o.e.t.h.MainSSLTest 99th % 132522508 ns
22:59:12.617 [main] INFO o.e.t.h.MainSSLTest 99.9 % 445411125 ns

The stats don't show any kind of smoking gun like I had hoped they would. Not a lot of difference there.

Differences in the requests per second are also not huge, but more in line with what I was expecting. If I can believe those numbers, and I admit that this kind of micro-benchmark is not the most reliable way to test performance, it looks like 3.1.0 with the lock fixes is slightly faster than 1.1.1t. 24 threads might not be enough to really exercise the concurrency though.

I will poke at it a little more tomorrow, trying more threads.
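A minimal sketch of the loop described above, for readers who want the shape of the test without pulling the repo; this is not the actual haproxytestssl code, and the URL, pool sizes, and statistics output are simplified placeholders:

    import org.apache.http.HttpHeaders;
    import org.apache.http.client.methods.CloseableHttpResponse;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.apache.http.util.EntityUtils;

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class LoadSketch {
        public static void main(String[] args) throws Exception {
            final String url = args.length > 0 ? args[0] : "https://hostname/";
            final int threads = 24;
            final int requestsPerThread = 1000;

            // Raise the pool limits so all threads can actually run in parallel.
            final CloseableHttpClient client = HttpClients.custom()
                    .setMaxConnTotal(1024)
                    .setMaxConnPerRoute(1024)
                    .build();

            final List<Long> timesMs = Collections.synchronizedList(new ArrayList<>());
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            for (int t = 0; t < threads; t++) {
                pool.submit(() -> {
                    for (int i = 0; i < requestsPerThread; i++) {
                        HttpGet get = new HttpGet(url);
                        get.setHeader(HttpHeaders.CONNECTION, "close"); // defeat keep-alive
                        long start = System.nanoTime();
                        try (CloseableHttpResponse rsp = client.execute(get)) {
                            EntityUtils.consume(rsp.getEntity());
                        } catch (Exception e) {
                            // a real test should count failures separately
                        }
                        timesMs.add((System.nanoTime() - start) / 1_000_000L);
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);

            Collections.sort(timesMs);
            System.out.printf("count %d, median %d ms, 95th %d ms, 99th %d ms%n",
                    timesMs.size(),
                    timesMs.get(timesMs.size() / 2),
                    timesMs.get((int) (timesMs.size() * 0.95)),
                    timesMs.get((int) (timesMs.size() * 0.99)));
        }
    }

With keep-alive defeated by the Connection: close header, every request forces a full TLS handshake, which is the case where the OpenSSL 3.0 locking is reported to hurt.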
Re: Followup on openssl 3.0 note seen in another thread
On Thu, May 25, 2023 at 17:11, Willy Tarreau wrote:
> On Thu, May 25, 2023 at 07:33:11AM -0600, Shawn Heisey wrote:
> > On 3/11/23 22:52, Willy Tarreau wrote:
> > > According to the OpenSSL devs, 3.1 should be "4 times better than 3.0",
> > > so it could still remain 5-40 times worse than 1.1.1. I intend to run
> > > some tests soon on it on a large machine, but preparing tests takes a
> > > lot of time and my progress got delayed by the painful bug of last week.
> > > I'll share my findings anywya.
> >
> > Just noticed that quictls has a special branch for lock changes in 3.1.0:
> >
> > https://github.com/quictls/openssl/tree/openssl-3.1.0+quic+locks
>
> Yes, it was made so that the few of us who reported important issues can
> retest the impact of the changes. I hope to be able to run a test on a
> smaller machine soon.
>
> > I am not sure how to go about proper testing for performance on this. I did
> > try a very basic "curl a URL 1000 times in bash" test back when 3.1.0 was
> > released, but that showed 3.0.8 and 3.1.0 were faster than 1.1.1, so
> > concurrency is likely required to see a problem.
>
> The problem definitely is concurrency, so 1000 curl will show nothing
> and will not even match production traffic. You'll need to use a load

I do not think 1000 instances of curl are required. I recall doing some comparative tests (when we evaluated arm64 servers); some really lightweight runs with profiling enabled were enough to compare "before" and "after". I'll try JMeter next weekend maybe.

> generator that allows you to tweak the TLS resume support, like we do
> with h1load's argument "--tls-reuse". Also I don't know how often the
> recently modified locks are used per server connection and per client
> connection, that's what the SSL guys want to know since they're not able
> to test their changes.
>
> The first test report *before* the changes was published here a month
> ago:
>
> https://github.com/openssl/openssl/issues/20286#issuecomment-1527869072
>
> And now we have to find time to setup a test platform to test this one
> in more or less similar conditions (or at least run a before/after).
>
> Do not hesitate to participate if you see you can provide results
> comparing the two quictls-3.1 branches, it will help already. It's even
> possible that these efforts do not bring anything yet, we don't know and
> that's what they want to know.
>
> Thanks,
> Willy
Re: Followup on openssl 3.0 note seen in another thread
On Thu, May 25, 2023 at 07:33:11AM -0600, Shawn Heisey wrote: > On 3/11/23 22:52, Willy Tarreau wrote: > > According to the OpenSSL devs, 3.1 should be "4 times better than 3.0", > > so it could still remain 5-40 times worse than 1.1.1. I intend to run > > some tests soon on it on a large machine, but preparing tests takes a > > lot of time and my progress got delayed by the painful bug of last week. > > I'll share my findings anywya. > > Just noticed that quictls has a special branch for lock changes in 3.1.0: > > https://github.com/quictls/openssl/tree/openssl-3.1.0+quic+locks Yes, it was made so that the few of us who reported important issues can retest the impact of the changes. I hope to be able to run a test on a smaller machine soon. > I am not sure how to go about proper testing for performance on this. I did > try a very basic "curl a URL 1000 times in bash" test back when 3.1.0 was > released, but that showed 3.0.8 and 3.1.0 were faster than 1.1.1, so > concurrency is likely required to see a problem. The problem definitely is concurrency, so 1000 curl will show nothing and will not even match production traffic. You'll need to use a load generator that allows you to tweak the TLS resume support, like we do with h1load's argument "--tls-reuse". Also I don't know how often the recently modified locks are used per server connection and per client connection, that's what the SSL guys want to know since they're not able to test their changes. The first test report *before* the changes was published here a month ago: https://github.com/openssl/openssl/issues/20286#issuecomment-1527869072 And now we have to find time to setup a test platform to test this one in more or less similar conditions (or at least run a before/after). Do not hesitate to participate if you see you can provide results comparing the two quictls-3.1 branches, it will help already. It's even possible that these efforts do not bring anything yet, we don't know and that's what they want to know. Thanks, Willy
Re: Followup on openssl 3.0 note seen in another thread
On 3/11/23 22:52, Willy Tarreau wrote: According to the OpenSSL devs, 3.1 should be "4 times better than 3.0", so it could still remain 5-40 times worse than 1.1.1. I intend to run some tests soon on it on a large machine, but preparing tests takes a lot of time and my progress got delayed by the painful bug of last week. I'll share my findings anywya. Just noticed that quictls has a special branch for lock changes in 3.1.0: https://github.com/quictls/openssl/tree/openssl-3.1.0+quic+locks I am not sure how to go about proper testing for performance on this. I did try a very basic "curl a URL 1000 times in bash" test back when 3.1.0 was released, but that showed 3.0.8 and 3.1.0 were faster than 1.1.1, so concurrency is likely required to see a problem. Thanks, Shawn
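For reference, that kind of sequential test amounts to something like the loop below; each request completes before the next begins, so only one TLS handshake is ever in flight and the lock contention never appears (the URL is a placeholder):

    for i in $(seq 1000); do
        curl -so /dev/null https://hostname/
    done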
Re: Followup on openssl 3.0 note seen in another thread
Hi Shawn, On Sat, Mar 11, 2023 at 07:10:30PM -0700, Shawn Heisey wrote: > On 12/14/22 07:15, Willy Tarreau wrote: > > On Wed, Dec 14, 2022 at 07:01:59AM -0700, Shawn Heisey wrote: > > > On 12/14/22 06:07, Willy Tarreau wrote: > > > > By the way, are you running with OpenSSL > > > > 3.0 ? That one is absolutely terrible and makes extreme abuse of > > > > mutexes and locks, to the point that certain workloads were divided > > > > by 2-digit numbers between 1.1.1 and 3.0. It took me one day to > > > > figure that my load generator which was caping at 400 conn/s was in > > > > fact suffering from an accidental build using 3.0 while in 1.1.1 > > > > the perf went back to 75000/s! > > > > > > Is this a current problem with the latest openssl built from source? > > > > Yes and deeper than that actually, there's even a meta-issue to try to > > reference the many reports for massive performance regressions on the > > project: > > A followup to my followup. Time flies! > > I was just reading on the openssl mailing list about what's coming in > version 3.1. The first release highlight is: > > * Refactoring of the OSSL_LIB_CTX code to avoid excessive locking > > Is anyone enough in tune with openssl happenings to know whether that fixes > the issues that Willy was advising me about? Or maybe improves the > situation but doesn't fully resolve it? According to the OpenSSL devs, 3.1 should be "4 times better than 3.0", so it could still remain 5-40 times worse than 1.1.1. I intend to run some tests soon on it on a large machine, but preparing tests takes a lot of time and my progress got delayed by the painful bug of last week. I'll share my findings anywya. > I tried to figure this out for myself based on data in the CHANGES.md file, > but didn't see anything that looked relevant to my very untrained eye. Quite frankly I suspect it's the same for those who write that file as well :-/ > Reading the code wouldn't help, as I am completely clueless when it comes to > encryption code. Same for me. Cheers, Willy
Re: Followup on openssl 3.0 note seen in another thread
On 12/14/22 07:15, Willy Tarreau wrote: On Wed, Dec 14, 2022 at 07:01:59AM -0700, Shawn Heisey wrote: On 12/14/22 06:07, Willy Tarreau wrote: By the way, are you running with OpenSSL 3.0 ? That one is absolutely terrible and makes extreme abuse of mutexes and locks, to the point that certain workloads were divided by 2-digit numbers between 1.1.1 and 3.0. It took me one day to figure that my load generator which was caping at 400 conn/s was in fact suffering from an accidental build using 3.0 while in 1.1.1 the perf went back to 75000/s! Is this a current problem with the latest openssl built from source? Yes and deeper than that actually, there's even a meta-issue to try to reference the many reports for massive performance regressions on the project: A followup to my followup. Time flies! I was just reading on the openssl mailing list about what's coming in version 3.1. The first release highlight is: * Refactoring of the OSSL_LIB_CTX code to avoid excessive locking Is anyone enough in tune with openssl happenings to know whether that fixes the issues that Willy was advising me about? Or maybe improves the situation but doesn't fully resolve it? I tried to figure this out for myself based on data in the CHANGES.md file, but didn't see anything that looked relevant to my very untrained eye. Reading the code wouldn't help, as I am completely clueless when it comes to encryption code. Thanks, Shawn
Re: Followup on openssl 3.0 note seen in another thread
On Fri, Dec 16, 2022 at 06:58:33AM -0700, Shawn Heisey wrote: > On 12/16/22 01:59, Shawn Heisey wrote: > > On 12/16/22 00:26, Willy Tarreau wrote: > > > Both work for me using firefox (green flash after reload). > > > > It wasn't working when I tested it. I rebooted for a kernel upgrade and > > it still wasn't working. > > > > And then a while later I was poking around in my zabbix UI and saw the > > green lightning bolt. No idea what changed. Glad it's working, but > > problems that fix themselves annoy me because I usually never learn what > > happened. > > I think I know what happened. > > I was having problems with my pacemaker cluster where it got very confused > about the haproxy resource. I had the haproxy service enabled at boot for > both systems. I have now disabled that in systemd so it's fully under the > control of pacemaker. I'm pretty sure that pacemaker was confused because > it saw the service running on a system where it should have been disabled > and pacemaker didn't start it ... and it decided that was unacceptable and > basically broke the cluster. > > So for a while I had the virtual IP resource on the "lesser" server and the > haproxy resource on the main server. But because I had haproxy enabled at > boot time, it was actually running on both. The haproxy config is the same > between both systems, but the other server was still running a broken > haproxy version. Most of the backends are actually on the better server > accessed by br0 IP address rather than localhost, so the broken haproxy was > still sending them to the right place. This also explains why I was not > seeing traffic with tcpdump filtering on "udp port 443". I have a ways to > go before I've got true HA for my websites. Setting up a database cluster > is going to be challenging, I think. > > I got pacemaker back in working order after I was done with my testing, so > both resources were colocated on the better server and haproxy was not > running on the other one. I think you tried the URLs after I had fixed > pacemaker, and when I saw it working on zabbix, that was also definitely > after I fixed pacemaker. Thanks for sharing your analysis. Indeed, everything makes sense now. > On that UDP bind thing ... I now have three binds defined. The virtual IP, > the IP of the first server, and the IP of the second server. As long as you don't have too many nodes, that's often the simplest thing to do. It requires ip_non_local_bind=1 but that's extremely frequent where haproxy runs. Willy
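The kernel setting Willy refers to is spelled ip_nonlocal_bind; enabling it so haproxy can bind the cluster's virtual IP on whichever node it starts on looks like this (the drop-in file name is arbitrary):

    sysctl -w net.ipv4.ip_nonlocal_bind=1
    echo 'net.ipv4.ip_nonlocal_bind = 1' > /etc/sysctl.d/99-nonlocal-bind.conf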
Re: Followup on openssl 3.0 note seen in another thread
On 12/16/22 01:59, Shawn Heisey wrote: On 12/16/22 00:26, Willy Tarreau wrote: > Both work for me using firefox (green flash after reload). It wasn't working when I tested it. I rebooted for a kernel upgrade and it still wasn't working. And then a while later I was poking around in my zabbix UI and saw the green lightning bolt. No idea what changed. Glad it's working, but problems that fix themselves annoy me because I usually never learn what happened. I think I know what happened. I was having problems with my pacemaker cluster where it got very confused about the haproxy resource. I had the haproxy service enabled at boot for both systems. I have now disabled that in systemd so it's fully under the control of pacemaker. I'm pretty sure that pacemaker was confused because it saw the service running on a system where it should have been disabled and pacemaker didn't start it ... and it decided that was unacceptable and basically broke the cluster. So for a while I had the virtual IP resource on the "lesser" server and the haproxy resource on the main server. But because I had haproxy enabled at boot time, it was actually running on both. The haproxy config is the same between both systems, but the other server was still running a broken haproxy version. Most of the backends are actually on the better server accessed by br0 IP address rather than localhost, so the broken haproxy was still sending them to the right place. This also explains why I was not seeing traffic with tcpdump filtering on "udp port 443". I have a ways to go before I've got true HA for my websites. Setting up a database cluster is going to be challenging, I think. I got pacemaker back in working order after I was done with my testing, so both resources were colocated on the better server and haproxy was not running on the other one. I think you tried the URLs after I had fixed pacemaker, and when I saw it working on zabbix, that was also definitely after I fixed pacemaker. On that UDP bind thing ... I now have three binds defined. The virtual IP, the IP of the first server, and the IP of the second server. Thanks, Shawn
Re: Followup on openssl 3.0 note seen in another thread
On 12/16/22 00:26, Willy Tarreau wrote: > Both work for me using firefox (green flash after reload). It wasn't working when I tested it. I rebooted for a kernel upgrade and it still wasn't working. And then a while later I was poking around in my zabbix UI and saw the green lightning bolt. No idea what changed. Glad it's working, but problems that fix themselves annoy me because I usually never learn what happened. > You indeed need to > bind to both the native and the virtual IP addresses (you can have the > two on the same "bind" line, delimited by comma). That's the little bit of info that I needed. Now it works the way I was expecting with both IP addresses. I have a lot less experience with UDP than TCP, I wasn't aware of that gotcha. It does make perfect sense now that it's been pointed out. Thanks, Shawn
Re: Followup on openssl 3.0 note seen in another thread
On 12/16/22 00:01, Willy Tarreau wrote: - if you want to use QUIC, use quictls-1.1.1. Once you have to build something yourself, you definitely don't want to waste your time on the performance-crippled 3.0, and 1.1.1 will change less often than 3.0 so that also means less package updates. - if you want to experiment with QUIC and help developers, running compatibility tests with the latest haproxy master and the latest WolfSSL master could be useful. I just don't know if the maintainers are ready to receive lots of uncoordinated reports yet, I'm aware that they're still in the process of fixing a few basic integration issues that will make things run much smoother soon. Similarly, LibreSSL's QUIC support is very recent (3.6) and few people seem to use LibreSSL, I don't know how well it's supported in distros these days. More tests on this one would probably be nice and may possibly encourage its support. I'd say that I am somewhere in between these two. Helping the devs is not an EXPLICIT goal, but I am already tinkering with this stuff for myself, so it's not a lot of extra effort to be involved here. I think my setup can provide a little bit of useful data and another test environment. Pursuing http3 has been fun. Straying offtopic: I find that being a useful member of open source communities is an awesome experience. For this one I'm not as much use at the code level as I am for other communities. My experience with C was a long time ago ... it was one of my first languages. I spend more time with Bash and Java than anything else these days. Occasionally delve into Perl, which I really like. On the subject of building things myself ... way back in the 90s I used to build all my own Linux kernels, enabling only what I needed, building it into the kernel directly, and optimizing for the specific CPU in the machine. And I tended to build most of the software I used from source as well. These days, some distros have figured out how to do all these things better than I ever could, so I mostly install from apt repos. For really mainstream software, they keep up with recent versions pretty well. For some software, haproxy being one of the most prominent, the distro packages are so far behind what's current that I pretty much have to build it myself if I want useful features. I got started using haproxy with version 1.4, and quickly went to 1.5-dev because I was pursuing the best TLS setup I could get. In those days I wasn't using source repositories, I would download tarballs from 1wt.eu. Thanks, Shawn
Re: Followup on openssl 3.0 note seen in another thread
On Thu, Dec 15, 2022 at 08:40:59PM -0700, Shawn Heisey wrote: > On 12/15/22 09:47, Shawn Heisey wrote: > > The version of curl with http3 support is not available in any of the > > distro repos for my Ubuntu machines, so I found a docker image with it. > > That works in cases where a browser won't switch, but that's because it > > never tries TCP, it goes straight to UDP. The problem doesn't break H3, > > it just breaks a browser's ability to transition from TCP to UDP. > > With the provided patch, the h3 is working well on the machine with this > URL: > > https://http3test.elyograg.org/ > > But it doesn't work correctly on the machine with this URL: > > https://admin.elyograg.org/ Both work for me using firefox (green flash after reload). (...) > TLDR: I also have another oddity. The basement server is part of a > pacemaker cluster which starts a virtual IP and haproxy on one of the > servers, with the server in question having the highest resource placement > setting. Two of the servers in the cluster are bare metal, the third is a > VM running on a third machine, providing a tiebreaker vote so the cluster > works properly without STONITH. Settings prevent the resources from > starting on the VM, and cause haproxy to always be co-located with the > virtual IP. I had to go with a VM because the third machine is running > Ubuntu 22.10 and I couldn't form the cluster with different versions of > pacemaker/corosync/pcsd on that machine compared to the other two. OK that's indeed a significant difference. > If I bind quic4@0.0.0.0:443 then UDP/443 requests to that virtual IP do not > work. But if I bind quic4@192.168.217.170:443 which is that virtual IP, > then UDP/443 requests do work. Expected (even though annoying). It's likely that responses are sent from the native IP address instead of the virtual one. That's actually due to QUIC relying on UDP, and UDP not being well supported by the good old BSD socket API (you can't specify the address to send from). We have a work around for this in 2.8-dev, which comes with other benefits but for now it's better to limit it to setups with less than a few thousands QUIC connections (yours very likely qualifies based on your explanation). I remembered we noted this limitation somehwere but can't find it anymore. Maybe it was just in the announce message. At least we need to make it more prominent (e.g. in the "bind" keyword documentation). You indeed need to bind to both the native and the virtual IP addresses (you can have the two on the same "bind" line, delimited by comma). Hoping this helps, Willy
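As a sketch of that comma-delimited form, using the virtual IP mentioned in the thread and an illustrative native address (the certificate path is a placeholder):

    frontend https_in
        # TCP listeners for HTTP/1.1 and HTTP/2
        bind 192.168.217.170:443,192.168.217.171:443 ssl crt /etc/haproxy/certs/site.pem alpn h2,http/1.1
        # QUIC/UDP listeners; binding the explicit addresses avoids replies leaving from the wrong source IP
        bind quic4@192.168.217.170:443,quic4@192.168.217.171:443 ssl crt /etc/haproxy/certs/site.pem alpn h3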
Re: Followup on openssl 3.0 note seen in another thread
On Thu, Dec 15, 2022 at 09:47:36AM -0700, Shawn Heisey wrote: > Just got a look at the patch. One line code fixes are awesome. We all love them. Sometimes I even suspect we unconsciously create such bugs to have the pleasure of contemplating these fixes :-) Willy
Re: Followup on openssl 3.0 note seen in another thread
On Fri, Dec 16, 2022 at 01:44:15AM -0500, John Lauro wrote: > What exactly is needed to reproduce the poor performance issue with openssl > 3? I was able to test 20k req/sec with it using k6 to simulate 16k users > over a wan. The k6 box did have openssl1. Probably could have sustained > more, but that's all I need right now. Openssl v1 tested a little faster, > but within 10%. Wasn't trying to max out my tests as that should be over > 4x the needed performance. It mainly depends on the number of CPU cores. What's happening is that in 1.1.0 they silently removed the support for the locking callbacks (these are now ignored) and switched to pthread_mutex instead, without realizing that in case of contention, syscalls would be emitted. Using syscalls for tiny operations is already not good, but it got even worse in the post-SPECTRE era. And in 3.0 they made lots of stuff much more dynamic, with locks everywhere. I measured about 80 lock/unlock sequences for a single request! The problem is that once the load becomes sufficient for threads to compete on a lock, one of them goes into the system and sleeps there. And that's when you start seeing native_queued_spin_lock_slowpath() eat all your CPU. Worse, the time wasted sleeping in the system is so huge compared to the tiny operations that the lock aimed at protecting against, that this time is definitely lost and the system can never recover from this loss because work continues to accumulate. So you can observe good performance until it's too high, at which point you have to significantly lower it to recover. The worst I've seen was the client mode with performance going down from 74k cps to 400 cps on a 24-core machine, i.e. performance divided by almost 200! > Not doing H3, and the backends are send-proxy-v2. > Default libs on Alma linux on arm. > # rpm -qa | grep openssl > openssl-pkcs11-0.4.11-7.el9.aarch64 > xmlsec1-openssl-1.2.29-9.el9.aarch64 > openssl-libs-3.0.1-43.el9_0.aarch64 > openssl-3.0.1-43.el9_0.aarch64 > openssl-devel-3.0.1-43.el9_0.aarch64 > > This is the first box I setup with EL9 and thus openssl-3. Might it only > be an issue when ssl is used to the backends? That's where it has the highest effect, sadly, mostly with renegotiation. If you intend to run at less than a few thousands connection per second it could possibly be OK. Emeric collected some numbers, and we'll soon post them (but bear with us, it takes time to aggregate everything). Also, I don't know if you're using HTTP on the backends, but if so, you should normally mostly benefit from keep-alive and connection reuse. If you want to reproduce these issues, make sure you disable http-reuse (http-reuse never), and disable session resumption on the "server" lines ("no-ssl-reuse"). And never forget to run "perf top" on the machine to see where the CPU is spent. Willy
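In haproxy configuration terms, the reproduction settings described above look roughly like this; the backend name, server address, and TLS verification are placeholders:

    backend be_tls_stress
        http-reuse never                    # no connection reuse: every request opens a new server-side connection
        server s1 192.0.2.10:443 ssl verify none no-ssl-reuse  # no TLS session resumption toward the server

While the test runs, "perf top" on the haproxy machine shows whether the time is going into crypto or into lock handling such as native_queued_spin_lock_slowpath.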
Re: Followup on openssl 3.0 note seen in another thread
On Thu, Dec 15, 2022 at 11:39:16PM -0700, Shawn Heisey wrote: > On 12/15/22 21:49, Willy Tarreau wrote: > > There's currently a great momentum around WolfSSL that was already > > adopted by Apache, Curl, and Ngtcp2 (which is the QUIC stack that > > powers most HTTP/3-compatible agents). Its support on haproxy is > > making fast progress thanks to the efforts on the two sides, and it's > > pleasant to speak to people who care about performance. > > What would be your recommendation right now for a quic-enabled library to > use with haproxy? Are there any choices better than quictls 1.1.1? > Is wolfSSL support far enough along that I could build and try it and have > some hope of success, or should I stick with quictls for now? For now I'd say that quictls 1.1.1 is the best option. 1.1.x doesn't scale very well but doesn't collapse under load like 3.0 at least. And admittedly, support for openssl is proven by now. Other libs are either unmaintainable (BoringSSL with no release cycle and whose API regularly breaks the build in the middle of our stable branches), lagging a bit behind (LibreSSL has not caught up with 1.1.1 on everything and is measurably slower), not supported yet (GnuTLS), or only starting to be supported by haproxy (WolfSSL). Thus I'd suggest in this order: - if you don't want to use QUIC and have a small or personal site, use your distro's package, even if it's 3.0, you're unlikely to notice the performance problems. - if you don't want to use QUIC but have a moderate to large site, use openssl 1.1.1, which is easily achieved by staying on the current LTS distros that still provide it. This way you won't need to build and maintain your own package. - if you want to use QUIC, use quictls-1.1.1. Once you have to build something yourself, you definitely don't want to waste your time on the performance-crippled 3.0, and 1.1.1 will change less often than 3.0 so that also means fewer package updates. - if you want to experiment with QUIC and help developers, running compatibility tests with the latest haproxy master and the latest WolfSSL master could be useful. I just don't know if the maintainers are ready to receive lots of uncoordinated reports yet; I'm aware that they're still in the process of fixing a few basic integration issues that will make things run much smoother soon. Similarly, LibreSSL's QUIC support is very recent (3.6) and few people seem to use LibreSSL; I don't know how well it's supported in distros these days. More tests on this one would probably be nice and may possibly encourage its support. > My websites > certainly aren't anything mission-critical, but there are people that would > be annoyed if I have problems. That's a good reason for staying on quictls for now. That's what we're doing on haproxy.org as well. > Email is more important than the websites, > and that's directly on the Internet in my AWS instance, not going through > haproxy. OK. This part should definitely not be touched under any circumstance. Hoping this helps, Willy
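For those taking the quictls-1.1.1 route recommended above, a build could look roughly like the following; the branch name, install prefix and make variables are assumptions based on common quictls and haproxy build steps, not commands taken from this thread:

    git clone --branch OpenSSL_1_1_1s+quic https://github.com/quictls/openssl quictls
    cd quictls
    ./config --prefix=/opt/quictls threads
    make -j$(nproc) && sudo make install_sw
    cd ../haproxy
    make -j$(nproc) TARGET=linux-glibc USE_OPENSSL=1 USE_QUIC=1 SSL_INC=/opt/quictls/include SSL_LIB=/opt/quictls/lib ADDLIB="-Wl,-rpath,/opt/quictls/lib"

Afterwards "haproxy -vv" should report a 1.1.1x+quic version on both the "Built with" and "Running on" OpenSSL lines, as in the -vv output quoted elsewhere in this thread.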
Re: Followup on openssl 3.0 note seen in another thread
What exactly is needed to reproduce the poor performance issue with openssl 3? I was able to test 20k req/sec with it using k6 to simulate 16k users over a wan. The k6 box did have openssl1. Probably could have sustained more, but that's all I need right now. Openssl v1 tested a little faster, but within 10%. Wasn't trying to max out my tests as that should be over 4x the needed performance. Not doing H3, and the backends are send-proxy-v2. Default libs on Alma linux on arm. # rpm -qa | grep openssl openssl-pkcs11-0.4.11-7.el9.aarch64 xmlsec1-openssl-1.2.29-9.el9.aarch64 openssl-libs-3.0.1-43.el9_0.aarch64 openssl-3.0.1-43.el9_0.aarch64 openssl-devel-3.0.1-43.el9_0.aarch64 This is the first box I setup with EL9 and thus openssl-3. Might it only be an issue when ssl is used to the backends? On Thu, Dec 15, 2022 at 11:50 PM Willy Tarreau wrote: > On Thu, Dec 15, 2022 at 08:58:29PM -0700, Shawn Heisey wrote: > > I'm sure the performance issue has been brought to the attention of the > > OpenSSL project ... what did they have to say about the likelihood and > > timeline for providing a fix? > > They're still working on it for 3.1. 3.1-alpha is "less worse" than > 3.0 but still far behind 1.1.1 in our tests. > > > Is there an article or bug filed I can read for more information? > > There's this issue that centralizes the status of the most important > regression reports: > > https://github.com/openssl/openssl/issues/17627#issuecomment-1060123659 > > We've also planned to issue an article to summarize our observations > about this before users are hit too strong, but it will take some > time to collect all info and write it down. But it's definitely a big > problem for users who upgrade to latest LTS distros that shipped 3.0 > without testing it (though I can't blame distros, it's not the package > maintainers' job to run performance tests on what they maintain) :-( > > My personal feeling is that this disaster combined with the stubborn > refusal to support the QUIC crypto API that is mandatory for any > post-2021 HTTP agent basically means that OpenSSL is not part of the > future of web environments and that it's urgent to find alternatives, > just like all other projects are currently seeking. And with http-based > products forced to abandon OpenSSL, it's unlikely that their performance > issues will be relevant in the future so it should get even worse over > time by lack of testing and exposure. It's sad, because before the QUIC > drama, we hoped to spend some time helping them improve their perfomance > by reducing the locking abuse. Now the project has gone too far in the > wrong direction for anything to be doable anymore, and I doubt that > anyone has the energy to fork 1.1.1 and restart from a mostly clean > state. But anyway, a solution must be found for the next batch of LTS > distros so that users can jump from 20.x to 24.x and skip 22.x. > > There's currently a great momentum around WolfSSL that was already > adopted by Apache, Curl, and Ngtcp2 (which is the QUIC stack that > powers most HTTP/3-compatible agents). Its support on haproxy is > making fast progress thanks to the efforts on the two sides, and it's > pleasant to speak to people who care about performance. I'd bet we'll > find it packaged in a usable state long before OpenSSL finally changes > their mind on QUIC and reaches distros in a usable state. That's a > perfect (though sad) example of the impact of design by committee! 
> >https://www.openssl.org/policies/omc-bylaws.html#OMC >https://en.wikipedia.org/wiki/Design_by_committee > > Everything was written... > Willy > >
Re: Followup on openssl 3.0 note seen in another thread
On Fri, Dec 16, 2022 at 07:29:23AM +0100, Vincent Bernat wrote: > On 2022-12-16 05:49, Willy Tarreau wrote: > > There's currently a great momentum around WolfSSL that was already > > adopted by Apache, Curl, and Ngtcp2 (which is the QUIC stack that > > powers most HTTP/3-compatible agents). Its support on haproxy is > > making fast progress thanks to the efforts on the two sides, and it's > > pleasant to speak to people who care about performance. I'd bet we'll > > find it packaged in a usable state long before OpenSSL finally changes > > their mind on QUIC and reaches distros in a usable state. That's a > > perfect (though sad) example of the impact of design by committee! > > It's currently packaged in Debian and Ubuntu. For Ubuntu, it is currently in > universe (no security support). For Debian, there are discussions to not > ship it in the next release due to security concerns, but this is worked on. That's great! I noticed that the lib comes with many build options, and I guess that one difficult aspect will be to figure which ones to enable in the packaged version. I guess that the various projects supporting it will help them figure a reasonable set of default settings that suits everyone (at least all packaged projects). This could constitute a potential solution to have both QUIC support and performance back in future distros. > I'll ask again later when its support is finished in HAProxy if we can > switch to it for Debian/Ubuntu packages. Great, thank you for your help! Most users don't realize how much the success of certain protocol improvements depends on just a bunch of people's willingness to improve the situation for end users ;-) > Next Debian will be using OpenSSL 3.0.0. Ubuntu is using OpenSSL 3.0.0 since > Jammy. Good to know for Debian, thanks! Willy
Re: Followup on openssl 3.0 note seen in another thread
On 12/15/22 21:49, Willy Tarreau wrote: There's currently a great momentum around WolfSSL that was already adopted by Apache, Curl, and Ngtcp2 (which is the QUIC stack that powers most HTTP/3-compatible agents). Its support on haproxy is making fast progress thanks to the efforts on the two sides, and it's pleasant to speak to people who care about performance. What would be your recommendation right now for a quic-enabled library to use with haproxy? Are there any choices better than quictls 1.1.1? Is wolfSSL support far enough along that I could build and try it and have some hope of success, or should I stick with quictls for now? My websites certainly aren't anything mission-critical, but there are people that would be annoyed if I have problems. Email is more important than the websites, and that's directly on the Internet in my AWS instance, not going through haproxy. Thanks, Shawn
Re: Followup on openssl 3.0 note seen in another thread
On 2022-12-16 05:49, Willy Tarreau wrote: There's currently a great momentum around WolfSSL that was already adopted by Apache, Curl, and Ngtcp2 (which is the QUIC stack that powers most HTTP/3-compatible agents). Its support on haproxy is making fast progress thanks to the efforts on the two sides, and it's pleasant to speak to people who care about performance. I'd bet we'll find it packaged in a usable state long before OpenSSL finally changes their mind on QUIC and reaches distros in a usable state. That's a perfect (though sad) example of the impact of design by committee! It's currently packaged in Debian and Ubuntu. For Ubuntu, it is currently in universe (no security support). For Debian, there are discussions to not ship it in the next release due to security concerns, but this is worked on. I'll ask again later when its support is finished in HAProxy if we can switch to it for Debian/Ubuntu packages. Next Debian will be using OpenSSL 3.0.0. Ubuntu is using OpenSSL 3.0.0 since Jammy.
Re: Followup on openssl 3.0 note seen in another thread
On Thu, Dec 15, 2022 at 08:58:29PM -0700, Shawn Heisey wrote: > I'm sure the performance issue has been brought to the attention of the > OpenSSL project ... what did they have to say about the likelihood and > timeline for providing a fix? They're still working on it for 3.1. 3.1-alpha is "less worse" than 3.0 but still far behind 1.1.1 in our tests. > Is there an article or bug filed I can read for more information? There's this issue that centralizes the status of the most important regression reports: https://github.com/openssl/openssl/issues/17627#issuecomment-1060123659 We've also planned to issue an article to summarize our observations about this before users are hit too hard, but it will take some time to collect all info and write it down. But it's definitely a big problem for users who upgrade to the latest LTS distros that shipped 3.0 without testing it (though I can't blame distros, it's not the package maintainers' job to run performance tests on what they maintain) :-( My personal feeling is that this disaster combined with the stubborn refusal to support the QUIC crypto API that is mandatory for any post-2021 HTTP agent basically means that OpenSSL is not part of the future of web environments and that it's urgent to find alternatives, just like all other projects are currently seeking. And with http-based products forced to abandon OpenSSL, it's unlikely that their performance issues will be relevant in the future so it should get even worse over time by lack of testing and exposure. It's sad, because before the QUIC drama, we hoped to spend some time helping them improve their performance by reducing the locking abuse. Now the project has gone too far in the wrong direction for anything to be doable anymore, and I doubt that anyone has the energy to fork 1.1.1 and restart from a mostly clean state. But anyway, a solution must be found for the next batch of LTS distros so that users can jump from 20.x to 24.x and skip 22.x. There's currently a great momentum around WolfSSL that was already adopted by Apache, Curl, and Ngtcp2 (which is the QUIC stack that powers most HTTP/3-compatible agents). Its support on haproxy is making fast progress thanks to the efforts on the two sides, and it's pleasant to speak to people who care about performance. I'd bet we'll find it packaged in a usable state long before OpenSSL finally changes their mind on QUIC and reaches distros in a usable state. That's a perfect (though sad) example of the impact of design by committee! https://www.openssl.org/policies/omc-bylaws.html#OMC https://en.wikipedia.org/wiki/Design_by_committee Everything was written... Willy
Re: Followup on openssl 3.0 note seen in another thread
On 12/15/22 02:19, Willy Tarreau wrote: I guess you'll get them only while the previous version remains maintained (i.e. use a package from the previous LTS distro). But regardless you'll also need to use executables linked with that version and that's where it can become a pain. When I upgraded my main server from Ubuntu 20 to Ubuntu 22, it still had openssl 1.1.x installed as an unmanaged package not part of any repo. Little by little I got my third-party APT repos updated to jammy. The last holdout was Gitlab, and I got that resolved just a few days ago. Then I was able to remove the 1.1 package. I'm sure the performance issue has been brought to the attention of the OpenSSL project ... what did they have to say about the likelihood and timeline for providing a fix? Is there an article or bug filed I can read for more information? Thanks, Shawn
Re: Followup on openssl 3.0 note seen in another thread
On 12/15/22 09:47, Shawn Heisey wrote: The version of curl with http3 support is not available in any of the distro repos for my Ubuntu machines, so I found a docker image with it. That works in cases where a browser won't switch, but that's because it never tries TCP, it goes straight to UDP. The problem doesn't break H3, it just breaks a browser's ability to transition from TCP to UDP. With the provided patch, the h3 is working well on the machine with this URL: https://http3test.elyograg.org/ But it doesn't work correctly on the machine with this URL: https://admin.elyograg.org/ Testing with the curl docker image works on both servers. Testing with https://http3check.net also works with both servers. The configs are not completely identical, but everything related to quic/h3 for those URLs is identical. The only significant difference I have found so far between the two systems is that the one that works is Ubuntu 20.04 with edge kernel 5.15, and the one that doesn't work is Ubuntu 22.04 with edge kernel 6.0. Both have quictls 1.1.1s compiled with exactly the same options, and the same haproxy 2.7 version with the same options -- up to date master with that one line patch. They have different openssl versions, but haproxy should not be using that, it should just be using quictls. The hardware is very different. The one that works is an AWS t3a.large instance, 2 CPUs (linux reports AMD EPYC 7571) and 8GB RAM. The one that doesn't work is a Dell R720xd in my basement with two of the following CPU, each with 12 cores, and 88GB RAM: Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz I have been through the configs working out minor differences, which resulted in changes to both configs. Nothing new -- the AWS instance is still working, the basement server isn't. The backends are the same except for the names and IP addresses. H3 used to work on the basement machine, and I couldn't say when it stopped working. I had seen the green lightning bolt on my zabbix install that runs on the basement machine, not sure when it disappeared. I noticed it first on the AWS machine when I was switching quictls versions. I usually update both servers haproxy together, so they probably stopped working about the same time. The patched version works well on one, but not the other. I downgraded the basement to 437fd289f2e32e56498d2d4da63852d483f284ef which should be the 2.7.0 release. That didn't help, so maybe there is something else going on. I believe that haproxy works intimately with kernel code ... could the difference of 5.15 and 6.0 (both with all of ubuntu's patches) be enough to explain this? These are very much homegrown configs. I cobbled together info from the documentation, info obtained on this mailing list, and random articles found with google. I might be doing things substantially different than a true expert would. This is how I configure quictls. If this should be adjusted, I'm open to that. - CONFARGS="--prefix=/opt/quictls enable-tls1_3 no-idea no-mdc2 no-rc5 no-zlib no-ssl3 enable-unit-test no-ssl3-method enable-rfc3779 enable-cms no-capieng threads" if [ "$(uname -i)" == "x86_64" ]; then CONFARGS="${CONFARGS} enable-ec_nistp_64_gcc_128" fi - And here is the latest haproxy -vv: HAProxy version 2.7.0-e557ae-43 2022/12/14 - https://haproxy.org/ Status: stable branch - will stop receiving fixes around Q1 2024. 
Known bugs: http://www.haproxy.org/bugs/bugs-2.7.0.html Running on: Linux 5.15.0-1026-aws #30~20.04.2-Ubuntu SMP Fri Nov 25 14:53:22 UTC 2022 x86_64 Build options : TARGET = linux-glibc CPU = native CC = cc CFLAGS = -O2 -march=native -g -Wall -Wextra -Wundef -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wno-string-plus-int -Wno-atomic-alignment OPTIONS = USE_PCRE2_JIT=1 USE_OPENSSL=1 USE_ZLIB=1 USE_SYSTEMD=1 USE_QUIC=1 DEBUG = Feature list : +EPOLL -KQUEUE +NETFILTER -PCRE -PCRE_JIT -PCRE2 +PCRE2_JIT +POLL +THREAD -PTHREAD_EMULATION +BACKTRACE -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H -ENGINE +GETADDRINFO +OPENSSL -OPENSSL_WOLFSSL -LUA +ACCEPT4 -CLOSEFROM +ZLIB -SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL +SYSTEMD -OBSOLETE_LINKER +PRCTL -PROCCTL +THREAD_DUMP -EVPORTS -OT +QUIC -PROMEX -MEMORY_PROFILING +SHM_OPEN Default settings : bufsize = 16384, maxrewrite = 1024, maxpollevents = 200 Built with multi-threading support (MAX_TGROUPS=16, MAX_THREADS=256, default=2). Built with OpenSSL version : OpenSSL 1.1.1s+quic 1 Nov 2022 Running on OpenSSL version : OpenSSL 1.1.1s+quic 1 Nov 2022 OpenSSL library supports TLS extensions : yes OpenSSL
Re: Followup on openssl 3.0 note seen in another thread
On 12/15/22 00:58, Amaury Denoyelle wrote: I seem to be able to reach your website with H3 currently. Did you revert to an older version ? Regarding this commit, it rejects requests with invalid headers (with uppercase or non-HTTP tokens in the field name). Have you tried with several browsers and with command-line clients ? Yes, once I found the problem commit, I reverted to the commit just prior, which is why you saw it working. Had to use --3way to get the patch from your other message to apply to the 2.8-dev master branch. Got that built and deployed. H3 works. Looking forward to the fix coming to 2.7. I did try with firefox, chrome, and a special version of curl. The version of curl with http3 support is not available in any of the distro repos for my Ubuntu machines, so I found a docker image with it. That works in cases where a browser won't switch, but that's because it never tries TCP, it goes straight to UDP. The problem doesn't break H3, it just breaks a browser's ability to transition from TCP to UDP. With the commit just prior to the one that broke H3 in a browser, H3 is a lot more stable than it has been in the past. Before, by clicking around between folders in my webmail, I could eventually (after maybe a dozen clicks) reach a point where the website becomes unresponsive until I shift-reload to get it back to H2 and then reload to have it switch to H3 again. That did not happen with the newer commit. Building with your patch also handles webmail flawlessly. Looks like you meant that I was supposed to apply the patch to the 2.7 master branch, not 2.8-dev. It applied there without --3way, and that also fixes the problem. Just got a look at the patch. One line code fixes are awesome. Thanks, Shawn
Re: Followup on openssl 3.0 note seen in another thread
On Thu, Dec 15, 2022 at 09:20:01AM +0100, Amaury Denoyelle wrote: > On Thu, Dec 15, 2022 at 09:03:18AM +0100, Amaury Denoyelle wrote: > > On Thu, Dec 15, 2022 at 08:58:16AM +0100, Amaury Denoyelle wrote: > > > On Wed, Dec 14, 2022 at 11:20:44PM -0700, Shawn Heisey wrote: > > > > On 12/14/22 21:23, Илья Шипицин wrote: > > > > > Can you try to bisect? > > > > I had made some incorrect assumptions about what's needed to use > > > > bisect. With a little bit of research I figured it out and it was a > > > > LOT easier than I had imagined. > > > > > I suspect that it won't help, browsers tend to remember things in > > > > > their own way > > > > One thing I have learned in my testing is that doing shift-reload on > > > > the page means it will never switch to h3. So I use shift-reload > > > > followed by a couple of regular reloads as a way of resetting what > > > > the browser remembers. That seems to work. > > > > The bisect process only took a few runs to find the problem commit: > > > > 3ca4223c5e1f18a19dc93b0b09ffdbd295554d46 is the first bad commit > > > > commit 3ca4223c5e1f18a19dc93b0b09ffdbd295554d46 > > > > Author: Amaury Denoyelle > > > > Date: Wed Dec 7 14:31:42 2022 +0100 > > > > BUG/MEDIUM: h3: reject request with invalid header name > > > > [...] > > > I seem to be able to reach your website with H3 currently. Did you > > > revert to an older version ? Regarding this commit, it rejects requests > > > with invalid headers (with uppercase or non-HTTP tokens in the field > > > name). Have you tried with several browsers and with command-line > > > clients ? > > > I will look on my side to see if I missed something. > > With a local instance of nextcloud I am able to reproduce a bug linked > > to this commit which caused the deactivation of H3. I'm investigating on > > it... > The issue seems to be triggered by requests with a cookie header. Can you > please apply the following patch on top of the master branch and confirm > whether this resolves your issue ? Thanks. > [...] I'm definitely sure about the fix, so I merged my patch. If you can, please give a try to the new master branch and tell me if your issue is resolved. Thank you for your help on this issue, I really appreciate it! -- Amaury Denoyelle
Re: Followup on openssl 3.0 note seen in another thread
On Thu, Dec 15, 2022 at 08:56:13AM +0100, Vincent Bernat wrote: > On 2022-12-14 15:15, Willy Tarreau wrote: > > Possibly, yes. It's more efficient in every way from what we can see. > > For users who build themselves (and with QUIC right now you don't have > > a better choice), it should not change anything and will keep robustness. > > For those relying on the distro's package, I don't know if it's possible > > to install the previous distro's package side-by-side, but in any case > > it can start to become a mess to deal with. > > It's possible on Debian and I suspect this is the same for RedHat. However, > you don't get security updates in this case. I guess you'll get them only while the previous version remains maintained (i.e. use a package from the previous LTS distro). But regardless you'll also need to use executables linked with that version and that's where it can become a pain. Willy
Re: Followup on openssl 3.0 note seen in another thread
On Thu, Dec 15, 2022 at 09:03:18AM +0100, Amaury Denoyelle wrote: > On Thu, Dec 15, 2022 at 08:58:16AM +0100, Amaury Denoyelle wrote: > > On Wed, Dec 14, 2022 at 11:20:44PM -0700, Shawn Heisey wrote: > > > On 12/14/22 21:23, Илья Шипицин wrote: > > > > Can you try to bisect? > > > I had made some incorrect assumptions about what's needed to use > > > bisect. With a little bit of research I figured it out and it was a > > > LOT easier than I had imagined. > > > > I suspect that it won't help, browsers tend to remember things in > > > > their own way > > > One thing I have learned in my testing is that doing shift-reload on > > > the page means it will never switch to h3. So I use shift-reload > > > followed by a couple of regular reloads as a way of resetting what > > > the browser remembers. That seems to work. > > > The bisect process only took a few runs to find the problem commit: > > > 3ca4223c5e1f18a19dc93b0b09ffdbd295554d46 is the first bad commit > > > commit 3ca4223c5e1f18a19dc93b0b09ffdbd295554d46 > > > Author: Amaury Denoyelle > > > Date: Wed Dec 7 14:31:42 2022 +0100 > > > BUG/MEDIUM: h3: reject request with invalid header name > > > [...] > > I seem to be able to reach your website with H3 currently. Did you > > revert to an older version ? Regarding this commit, it rejects requests > > with invalid headers (with uppercase or non-HTTP tokens in the field > > name). Have you tried with several browsers and with command-line > > clients ? > > I will look on my side to see if I missed something. > With a local instance of nextcloud I am able to reproduce a bug linked > to this commit which caused the deactivation of H3. I'm investigating on > it... The issue seems to be triggered by requests with a cookie header. Can you please apply the following patch on top of the master branch and confirm whether this resolves your issue ? Thanks. -- Amaury Denoyelle From 603a919c8b0cea75516571c27e427960e85fae72 Mon Sep 17 00:00:00 2001 From: Amaury Denoyelle Date: Thu, 15 Dec 2022 09:18:25 +0100 Subject: [PATCH] BUG/MEDIUM: h3: fix cookie header parsing --- src/h3.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/h3.c b/src/h3.c index d24b3de5f..10d19e2cd 100644 --- a/src/h3.c +++ b/src/h3.c @@ -544,6 +544,7 @@ static ssize_t h3_headers_to_htx(struct qcs *qcs, const struct buffer *buf, if (isteq(list[hdr_idx].n, ist("cookie"))) { http_cookie_register(list, hdr_idx, &cookie, &last_cookie); + ++hdr_idx; continue; } else if (isteq(list[hdr_idx].n, ist("content-length"))) { -- 2.39.0
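To verify the fix from the client side, a request that carries a cookie over HTTP/3 should exercise the corrected parsing path. This is only a sketch reusing the curl-http3 docker image and the test URL mentioned elsewhere in the thread; the cookie value is made up:

    sudo docker run --rm ymuski/curl-http3 curl -v --http3 -H 'Cookie: session=abc123' https://http3test.elyograg.org/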
Re: Followup on openssl 3.0 note seen in another thread
On Thu, Dec 15, 2022 at 08:58:16AM +0100, Amaury Denoyelle wrote: > On Wed, Dec 14, 2022 at 11:20:44PM -0700, Shawn Heisey wrote: > > On 12/14/22 21:23, Илья Шипицин wrote: > > > Can you try to bisect? > > I had made some incorrect assumptions about what's needed to use > > bisect. With a little bit of research I figured it out and it was a > > LOT easier than I had imagined. > > > I suspect that it won't help, browsers tend to remember things in > > > their own way > > One thing I have learned in my testing is that doing shift-reload on > > the page means it will never switch to h3. So I use shift-reload > > followed by a couple of regular reloads as a way of resetting what > > the browser remembers. That seems to work. > > The bisect process only took a few runs to find the problem commit: > > 3ca4223c5e1f18a19dc93b0b09ffdbd295554d46 is the first bad commit > > commit 3ca4223c5e1f18a19dc93b0b09ffdbd295554d46 > > Author: Amaury Denoyelle > > Date: Wed Dec 7 14:31:42 2022 +0100 > > BUG/MEDIUM: h3: reject request with invalid header name > > [...] > I seem to be able to reach your website with H3 currently. Did you > revert to an older version ? Regarding this commit, it rejects requests > with invalid headers (with uppercase or non-HTTP tokens in the field > name). Have you tried with several browsers and with command-line > clients ? > I will look on my side to see if I missed something. With a local instance of nextcloud I am able to reproduce a bug linked to this commit which caused the deactivation of H3. I'm investigating on it... -- Amaury Denoyelle
Re: Followup on openssl 3.0 note seen in another thread
On Wed, Dec 14, 2022 at 11:20:44PM -0700, Shawn Heisey wrote: > On 12/14/22 21:23, Илья Шипицин wrote: > > Can you try to bisect? > I had made some incorrect assumptions about what's needed to use > bisect. With a little bit of research I figured it out and it was a > LOT easier than I had imagined. > > I suspect that it won't help, browsers tend to remember things in > > their own way > One thing I have learned in my testing is that doing shift-reload on > the page means it will never switch to h3. So I use shift-reload > followed by a couple of regular reloads as a way of resetting what > the browser remembers. That seems to work. > The bisect process only took a few runs to find the problem commit: > 3ca4223c5e1f18a19dc93b0b09ffdbd295554d46 is the first bad commit > commit 3ca4223c5e1f18a19dc93b0b09ffdbd295554d46 > Author: Amaury Denoyelle > Date: Wed Dec 7 14:31:42 2022 +0100 > BUG/MEDIUM: h3: reject request with invalid header name > [...] I seem to be able to reach your website with H3 currently. Did you revert to an older version ? Regarding this commit, it rejects requests with invalid headers (with uppercase or non-HTTP tokens in the field name). Have you tried with several browsers and with command-line clients ? I will look on my side to see if I missed something. -- Amaury Denoyelle
Re: Followup on openssl 3.0 note seen in another thread
On 2022-12-14 15:15, Willy Tarreau wrote: Possibly, yes. It's more efficient in every way from what we can see. For users who build themselves (and with QUIC right now you don't have a better choice), it should not change anything and will keep robustness. For those relying on the distro's package, I don't know if it's possible to install the previous distro's package side-by-side, but in any case it can start to become a mess to deal with. It's possible on Debian and I suspect this is the same for RedHat. However, you don't get security updates in this case.
Re: Followup on openssl 3.0 note seen in another thread
On 12/14/22 21:23, Илья Шипицин wrote: Can you try to bisect? I had made some incorrect assumptions about what's needed to use bisect. With a little bit of research I figured it out and it was a LOT easier than I had imagined. I suspect that it won't help, browsers tend to remember things in their own way One thing I have learned in my testing is that doing shift-reload on the page means it will never switch to h3. So I use shift-reload followed by a couple of regular reloads as a way of resetting what the browser remembers. That seems to work. The bisect process only took a few runs to find the problem commit: 3ca4223c5e1f18a19dc93b0b09ffdbd295554d46 is the first bad commit commit 3ca4223c5e1f18a19dc93b0b09ffdbd295554d46 Author: Amaury Denoyelle Date: Wed Dec 7 14:31:42 2022 +0100 BUG/MEDIUM: h3: reject request with invalid header name Reject request containing invalid header name. This concerns every header containing uppercase letter or a non HTTP token such as a space. For the moment, this kind of errors triggers a connection close. In the future, it should be handled only with a stream reset. To reduce backport surface, this will be implemented in another commit. Thanks to Yuki Mogi from FFRI Security, Inc. for having reported this. This must be backported up to 2.6. (cherry picked from commit d6fb7a0e0f3a79afa1f4b6fc7b62053c3955dc4a) Signed-off-by: Christopher Faulet src/h3.c | 30 +- 1 file changed, 29 insertions(+), 1 deletion(-)
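For readers who, like Shawn, have not used bisect before, the workflow is roughly the sketch below; the known-good reference is a made-up example, and the middle step is repeated (rebuild, retest, mark) until git names the first bad commit:

    git bisect start
    git bisect bad                 # the current checkout shows the problem
    git bisect good v2.7.0         # hypothetical last version known to work
    # build and test the commit git checks out, then mark it:
    git bisect good    # or: git bisect bad
    git bisect reset               # return to the original branch when done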
Re: Followup on openssl 3.0 note seen in another thread
On Thu, Dec 15, 2022 at 10:23:59AM +0600, Илья Шипицин wrote: > Can you try to bisect? > > I suspect that it won't help, browsers tend to remember things in their own > way That's often the problem we've been facing as well during tests. When a browser decides that your QUIC implementation doesn't work, it seems to store the info "somewhere" for "some time". That's extremely frustrating because restarting usually doesn't change anything, and there doesn't seem to be anything available to tell them "OK I finished fiddling with my setup, please try again". Willy
Re: Followup on openssl 3.0 note seen in another thread
Can you try to bisect? I suspect that it won't help, browsers tend to remember things in their own way On Thu, Dec 15, 2022, 9:09 AM Shawn Heisey wrote: > On 12/14/22 19:33, Shawn Heisey wrote: > > With quictls 3.0.7 it was working. I will try rebuilding and see > > whether it still does. There was probably an update to haproxy as well > > as changing quictls -- my build script pulls the latest from the 2.7 git > > repo. > > Rebuilding with quictls 3.0.7 didn't change the behavior -- browsers > still don't switch to http3 as they did before, so the obvious conclusion > is that something changed in haproxy. > > If you would like me to do anything to help troubleshoot, please let me > know. > > This is the simplest test I have. Reloading this page used to switch to > http3: > > https://http3test.elyograg.org/ > > I also built and installed the latest 2.8.0-dev version with quictls > 1.1.1s. It doesn't switch to h3 either. > > Thanks, > Shawn > >
Re: Followup on openssl 3.0 note seen in another thread
On 12/14/22 19:33, Shawn Heisey wrote: With quictls 3.0.7 it was working. I will try rebuilding and see whether it still does. There was probably an update to haproxy as well as changing quictls -- my build script pulls the latest from the 2.7 git repo. Rebuilding with quictls 3.0.7 didn't change the behavior -- browsers still don't switch to http3 as they did before, so the obvious conclusion is that something changed in haproxy. If you would like me to do anything to help troubleshoot, please let me know. This is the simplest test I have. Reloading this page used to switch to http3: https://http3test.elyograg.org/ I also built and installed the latest 2.8.0-dev version with quictls 1.1.1s. It doesn't switch to h3 either. Thanks, Shawn
Re: Followup on openssl 3.0 note seen in another thread
On 12/14/22 07:15, Willy Tarreau wrote: Should I switch to quictls 1.1.1 instead? Possibly, yes I did this, and now browsers do not switch to http3. A direct request that forces http3 works, but browsers are not switching to it based on the alt-svc header. Tried both firefox and chrome which have been successful for me in the past. I grabbed a sniffer trace of UDP/443 when I ask for the page in firefox. Here is a wireshark view of that when following the UDP stream: https://www.dropbox.com/s/5sc8ylxt82mn0gf/h3_udp_capture_follow.png?dl=0 That certainly looks to me like a significant amount of two-way communication, but as it's encrypted, I have no idea what it might mean. The browser's console reports that the connection is http/2. With quictls 3.0.7 it was working. I will try rebuilding and see whether it still does. There was probably an update to haproxy as well as changing quictls -- my build script pulls the latest from the 2.7 git repo. Output from haproxy -vv: HAProxy version 2.7.0-e557ae-43 2022/12/14 - https://haproxy.org/ Status: stable branch - will stop receiving fixes around Q1 2024. Known bugs: http://www.haproxy.org/bugs/bugs-2.7.0.html Running on: Linux 5.15.0-1026-aws #30~20.04.2-Ubuntu SMP Fri Nov 25 14:53:22 UTC 2022 x86_64 Build options : TARGET = linux-glibc CPU = native CC = cc CFLAGS = -O2 -march=native -g -Wall -Wextra -Wundef -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wno-string-plus-int -Wno-atomic-alignment OPTIONS = USE_PCRE2_JIT=1 USE_OPENSSL=1 USE_ZLIB=1 USE_SYSTEMD=1 USE_QUIC=1 DEBUG = Feature list : +EPOLL -KQUEUE +NETFILTER -PCRE -PCRE_JIT -PCRE2 +PCRE2_JIT +POLL +THREAD -PTHREAD_EMULATION +BACKTRACE -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H -ENGINE +GETADDRINFO +OPENSSL -OPENSSL_WOLFSSL -LUA +ACCEPT4 -CLOSEFROM +ZLIB -SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL +SYSTEMD -OBSOLETE_LINKER +PRCTL -PROCCTL +THREAD_DUMP -EVPORTS -OT +QUIC -PROMEX -MEMORY_PROFILING +SHM_OPEN Default settings : bufsize = 16384, maxrewrite = 1024, maxpollevents = 200 Built with multi-threading support (MAX_TGROUPS=16, MAX_THREADS=256, default=2). Built with OpenSSL version : OpenSSL 1.1.1s+quic 1 Nov 2022 Running on OpenSSL version : OpenSSL 1.1.1s+quic 1 Nov 2022 OpenSSL library supports TLS extensions : yes OpenSSL library supports SNI : yes OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3 Built with network namespace support. Support for malloc_trim() is enabled. Built with zlib version : 1.2.11 Running on zlib version : 1.2.11 Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip") Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND Built with PCRE2 version : 10.34 2019-11-21 PCRE2 library supports JIT : yes Encrypted password support via crypt(3): yes Built with gcc compiler version 9.4.0 Available polling systems : epoll : pref=300, test result OK poll : pref=200, test result OK select : pref=150, test result OK Total: 3 (3 usable), will use epoll. 
Available multiplexer protocols : (protocols marked as <default> cannot be specified using 'proto' keyword) quic : mode=HTTP side=FE mux=QUIC flags=HTX|NO_UPG|FRAMED h2 : mode=HTTP side=FE|BE mux=H2 flags=HTX|HOL_RISK|NO_UPG fcgi : mode=HTTP side=BE mux=FCGI flags=HTX|HOL_RISK|NO_UPG <default> : mode=HTTP side=FE|BE mux=H1 flags=HTX h1 : mode=HTTP side=FE|BE mux=H1 flags=HTX|NO_UPG <default> : mode=TCP side=FE|BE mux=PASS flags= none : mode=TCP side=FE|BE mux=PASS flags=NO_UPG Available services : none Available filters : [BWLIM] bwlim-in [BWLIM] bwlim-out [CACHE] cache [COMP] compression [FCGI] fcgi-app [SPOE] spoe [TRACE] trace Thanks, Shawn
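For background on the switching behaviour being tested in this message, the usual way haproxy advertises HTTP/3 to browsers is an alt-svc response header pointing at the QUIC listener; a minimal sketch follows, with the frontend name and certificate path being placeholders rather than settings taken from this thread:

    frontend fe_h3test
        bind :443 ssl crt /etc/haproxy/site.pem alpn h2,http/1.1
        bind quic4@:443 ssl crt /etc/haproxy/site.pem alpn h3
        http-response set-header alt-svc 'h3=":443"; ma=900'

The browser first connects over TCP, sees the alt-svc header, and only then retries over UDP/443, which matches Shawn's earlier observation that a shift-reload stays on H2 and a couple of normal reloads are needed before the switch shows up.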
Re: Followup on openssl 3.0 note seen in another thread
On 12/14/22 12:06, Shawn Heisey wrote: I built a gitlab CI config to test out changes to my build/install scripts. I'm having some trouble with that where haproxy is not working right, I'll start a new thread. Turned out that most of those problems were due to docker-related issues. And then I discovered that in my tiny little test config for haproxy I had the bind line for udp/443 all wrong. The following command may be of interest to anyone testing out http3/quic support. It requires that you have docker installed. On Ubuntu that can be installed with "apt install docker.io". sudo docker run --add-host=host.docker.internal:host-gateway --rm ymuski/curl-http3 curl -v -m 4 -s -f -k "https://host.docker.internal/test_file" --http3 && echo GOOD The curl options configure a 4 second absolute timeout, suppress the usual progress meter that curl shows, turn 4xx or 5xx response codes into a nonzero exit status, and disable certificate validation. Perfect for a CI/CD pipeline. Thanks, Shawn
Re: Followup on openssl 3.0 note seen in another thread
On 12/14/22 07:15, Willy Tarreau wrote: Should I switch to quictls 1.1.1 instead? Possibly, yes. It's more efficient in every way from what we can see. For users who build themselves (and with QUIC right now you don't have a better choice), it should not change anything and will keep robustness. For those relying on the distro's package, I don't know if it's possible to install the previous distro's package side-by-side, but in any case it can start to become a mess to deal with. Bonus, 1.1.1s compiles noticeably faster than 3.0.7. Install seems about the same, but I figured out how to have the install NOT do the docs, which brought install time down to 3 seconds. I built a gitlab CI config to test out changes to my build/install scripts. I'm having some trouble with that where haproxy is not working right, I'll start a new thread. Thanks, Shawn
Re: Followup on openssl 3.0 note seen in another thread
On Wed, Dec 14, 2022 at 07:01:59AM -0700, Shawn Heisey wrote: > On 12/14/22 06:07, Willy Tarreau wrote: > > By the way, are you running with OpenSSL > > 3.0 ? That one is absolutely terrible and makes extreme abuse of > > mutexes and locks, to the point that certain workloads were divided > > by 2-digit numbers between 1.1.1 and 3.0. It took me one day to > > figure out that my load generator which was capping at 400 conn/s was in > > fact suffering from an accidental build using 3.0 while in 1.1.1 > > the perf went back to 75000/s! > > Is this a current problem with the latest openssl built from source? Yes, and it goes deeper than that actually; there's even a meta-issue to try to reference the many reports for massive performance regressions on the project: https://github.com/openssl/openssl/issues/17627#issuecomment-1060123659 > I'm > running my 2.7.x installs with quictls 3.0.7, which aside from the QUIC > support should be the same as openssl. Due to new distros progressively moving to 3.0, it's getting more and more exposed. And with 1.1.1 support ending soon, it's going to become a huge problem for many high-performance users. > 400 connections per second is a lot more than I need, but if it's that > inefficient, seems like overall system performance would take a hit even if > it's not completely saturated. My primary server has dual E5-2697 v2 CPUs, > but my mail server is a 2-CPU AWS instance. Actually you're in the same situation as plenty of users who don't need this level of performance and will not necessarily notice the problem until they face a traffic spike and the machine collapses. > Should I switch to quictls 1.1.1 instead? Possibly, yes. It's more efficient in every way from what we can see. For users who build themselves (and with QUIC right now you don't have a better choice), it should not change anything and will keep robustness. For those relying on the distro's package, I don't know if it's possible to install the previous distro's package side-by-side, but in any case it can start to become a mess to deal with. But if you're running at low loads and ideally not exposed to the net, it's unlikely that you'd notice it. What's really happening is that in order to make it more dynamic they've apparently replaced lots of constants with functions that run over lists under locks, so if you're facing very low load, the overhead will remain minimal, but once the load increases and multiple threads need to access the same elements, contention happens. To give you an idea, during a test I measured up to 80 calls to a rwlock for a single HTTP request... Mutexes are so expensive that they should be avoided by all means in low-level functions, and in the worst case should be limited to a single-digit count. Here it has no chance to ever recover once a short traffic spike touches the machine. Willy
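Given how easy it apparently is to link against the wrong library by accident, a quick check of which OpenSSL a given haproxy binary was built with and is running on can save a lot of head-scratching; the binary path below is an assumption, adjust it to the local install:

    haproxy -vv | grep -i 'openssl version'
    ldd /usr/local/sbin/haproxy | grep -Ei 'ssl|crypto'

The first command shows the "Built with" and "Running on" OpenSSL lines seen in the -vv output quoted elsewhere in this thread; the second only tells you something when haproxy is dynamically linked against the TLS library.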