On Tue, Dec 26, 2017 at 08:43:43AM +0000, Lucas Rolff wrote:
> Hi guys,
>
> I’m currently performing a few tests on haproxy and nginx to determine
> the best software to terminate SSL early in a hosting stack, and also
> perform some load balancing to varnish machines.
>
> I’d love to use haproxy for this setup, since haproxy does one thing
> really great – load balancing and the way it determines if backends
> are down.
>
> However, one issue I’m facing is the SSL termination performance: on a
> single core I manage to terminate 21000 connections per second on
> nginx, but only 748 connections per second in haproxy.
748 looks like what a single core on a VM can achieve in terms of
private key computation with RSA 2048 certs. You can confirm this by
running the following command in your VM: openssl speed rsa2048.
21000 is too high to be key computation only.

> They’re using the exact same cipher suite
> (TLSv1.2,ECDHE-RSA-AES128-GCM-SHA256) to minimize the SSL overhead. I
> decided to go for AES128 since the security itself isn’t super
> important, but rather just that things are somewhat encrypted (mainly
> serving static files or non-sensitive content).
>
> I’m testing with a single apache benchmark client (actually from the
> hypervisor where I have my VM running), so the network latency is
> minimal, to rule out networking being the cause and to get the highest
> possible numbers.

Can you please share the exact ab command you are using for your tests?

> I generated a flame graph for both haproxy and nginx using the `perf`
> tool.
>
> Haproxy flame graph can be found here:
> https://snaps.trollcdn.com/sadiZsJd96twAez0GUiWJdDiEbwsRPWUxJ3sRskLG4.svg
>
> Nginx flame graph can be found here:
> https://snaps.trollcdn.com/P7PVyDkjhsxbsXCmK6bzVeqWsHHwnOxRucnCYG084f.svg
>
> What I find odd is that in haproxy you’ll see libcrypto.so.1.0.2k with
> 81k samples, but the function right below (unknown) only got 8.3k
> samples, whereas in nginx the gap is *a lot* smaller, and I’ve still
> not really figured out what actually happens in haproxy that causes
> this gap.
>
> However, my main concern is the fact that when terminating SSL, nginx
> performs 28 times better.
>
> I’ve tried running haproxy with both 10 threads and 10 processes on a
> 12 core machine – pinning each thread or process to a specific core,
> and putting RX and TX queues on individual cores as well to ensure
> that load would be evenly distributed.
>
> Doing the same with nginx, I still get 5.5k requests per second on
> haproxy, but 125,000 requests per second on nginx (a 22 times
> difference). I got the absolute best performance on haproxy by using
> processes over threads – with the processes it’s not maxing out the
> CPU, but it is with the threads, so I’m not sure why this happens
> either.
>
> Now, since nginx can serve static files directly, I wanted to
> replicate the same in haproxy so I wouldn’t have to have a backend
> that would then do a connection in the backend, since this could
> surely degrade the overall requests per second on haproxy.
>
> I did this by using an errorfile 200 /etc/haproxy/errorfiles/200.http
> to just serve a file directly on the frontend.
>
> My haproxy config looks like this:
> https://gist.github.com/lucasRolff/36fc84ac44aad559c1d43ab6f30237c8

This configuration has no backend, so each request will be replied to
with a 503 response containing a "Connection: close" header, which
means each request will lead to a key computation.

> Does anyone have any suggestions or maybe insight into why haproxy
> seems to be terminating SSL connections at a way lower rate per second
> than, for example, nginx? Is there any missing functionality in
> haproxy that isn’t available, and thus causing nginx to succeed in
> terms of the performance/scalability for terminations?
>
> There’s many things I absolutely love about haproxy, but if there’s a
> 22-28x difference in how many SSL terminations it can handle per
> second, then we’re talking about a lot of added hardware to be able to
> handle, let’s say, 500k requests per second.

I don't think we are comparing the same values here; there definitely
isn't a 22-28x difference.
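To compare like for like, I would first measure the raw key computation
rate, then run ab with the same keep-alive settings against both
servers. Something along these lines should show where the gap comes
from (the address is just a placeholder for your VM, adjust URL and
counts to your setup):

  # raw per-core ceiling for RSA-2048 private key operations;
  # the "sign/s" column is roughly the max full handshakes per core
  openssl speed rsa2048

  # one TLS handshake per request (no keep-alive)
  ab -n 20000 -c 100 https://192.0.2.10/index.html

  # keep-alive: the handshake cost is amortised over many requests
  ab -n 20000 -c 100 -k https://192.0.2.10/index.html

If the nginx test keeps connections (or TLS sessions) alive while every
haproxy request triggers a full handshake because of the 503 with
"Connection: close", a 20-30x gap in requests per second is exactly
what you would expect.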
> The VM has AES-NI available as well.

I think that only applies to ciphering traffic, not key exchange.

Cheers,
Jérôme
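PS: if you want to see what AES-NI buys you, comparing the low-level
and EVP benchmarks should make it obvious (the -evp one goes through
the engine and, as far as I know, uses AES-NI, the other does not):

  openssl speed aes-128-cbc
  openssl speed -evp aes-128-cbc

Both only measure record ciphering, though; the rsa2048 signing is what
limits new connections per second.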

