On Tue, Dec 26, 2017 at 08:43:43AM +0000, Lucas Rolff wrote:
> Hi guys,
> 
> I’m currently performing a few tests on haproxy and nginx to determine the 
> best software to terminate SSL early in a hosting stack, and also perform 
> some load balancing to varnish machines.
> 
> I’d love to use haproxy for this setup, since haproxy does one thing really 
> great – load balancing and the way it determines if backends are down.
> 
> However, one issue I’m facing is the SSL termination performance, on a single 
> core I manage to terminate 21000 connections per second on nginx, but only 
> 748 connections per second in haproxy.

748 looks like what a single core on a VM can achieve in terms of private key
computation with rsa 2048 certs. You can confirm this by running the following
command in your VM:

openssl speed rsa2048

21000 is too high to be key computation only. 
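
If the "sign/s" figure on the "rsa 2048 bits" line is around 750, the VM is
simply CPU-bound on one RSA signature per handshake. For the 10 process test
you can estimate the theoretical ceiling the same way, e.g.:

openssl speed -multi 10 rsa2048

(-multi just runs the benchmark on several cores in parallel).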

> 
> They’re using the exact same cipher suite 
> (TLSv1.2,ECDHE-RSA-AES128-GCM-SHA256) to minimize the SSL overhead, I decided 
> to go for AES128 since the security itself isn’t super important, but rather 
> just that things are somewhat encrypted (mainly serving static files or 
> non-sensitive content).
> 
> I’m testing with a single apache benchmark client (actually from the 
> hypervisor where my VM is running, so the network latency is minimal), to 
> rule out networking being a bottleneck and to get the highest possible 
> numbers.

Can you please share the exact ab command you are using for your tests ?
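
In particular, whether ab uses keep-alive or not makes a big difference here,
since without it every request is a full TLS handshake. For example (the URL
is just a placeholder, and ab needs to be built with SSL support):

ab -n 100000 -c 100 https://192.0.2.1/
ab -n 100000 -c 100 -k https://192.0.2.1/

The first form pays the RSA cost on every request; the second reuses
connections, so it mostly measures ciphering throughput.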

> 
> I generated a flame graph for both haproxy and nginx using `perf` tool
> 
> Haproxy flame graph can be found here: 
> https://snaps.trollcdn.com/sadiZsJd96twAez0GUiWJdDiEbwsRPWUxJ3sRskLG4.svg
> 
> Nginx flame graph can be found here: 
> https://snaps.trollcdn.com/P7PVyDkjhsxbsXCmK6bzVeqWsHHwnOxRucnCYG084f.svg
> 
> What I find odd is that in haproxy you’ll see libcrypto.so.1.0.2k with 81k 
> samples, but the function right below (unknown) only got 8.3k samples, 
> whereas in nginx the gap is *a lot* smaller, and I still haven’t really 
> figured out what actually happens in haproxy that causes this gap.
> 
> However, my main concern is the fact that when terminating SSL, nginx 
> performs 28 times better.
> 
> I’ve tried running haproxy with either 10 threads or 10 processes on a 12 core 
> machine – pinning each thread or process to a specific core, and putting RX 
> and TX queues on individual cores as well to ensure that load would be evenly 
> distributed.
> 
> Doing the same with nginx, the result is still 5.5k requests per second on 
> haproxy, but 125,000 requests per second on nginx (a 22x difference).
> I got the best performance on haproxy by using processes rather than threads 
> – with processes it’s not maxing out the CPU, but with threads it is, so I’m 
> not sure why this happens either.
> 
> Now, since nginx can serve static files directly, I wanted to replicate the 
> same in haproxy so I wouldn’t need a backend that opens an extra connection 
> per request, since that would surely degrade the overall requests per second 
> on haproxy.
> 
> I did this by using an errorfile 200 /etc/haproxy/errorfiles/200.http to just 
> serve a file directly on the frontend.
> 
> My haproxy config looks like this: 
> https://gist.github.com/lucasRolff/36fc84ac44aad559c1d43ab6f30237c8

This configuration has no backend, so each request is answered with a 503
response containing a Connection: close header, which means every request
leads to a new key computation.
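
For a fairer comparison you probably want the frontend to point at a real
backend so that client-side keep-alive can work, something along these lines
(just a rough sketch, names and addresses are made up):

frontend fe_https
    bind :443 ssl crt /etc/haproxy/cert.pem
    mode http
    option http-keep-alive
    default_backend be_varnish

backend be_varnish
    mode http
    server varnish1 127.0.0.1:6081

With that, ab run with -k only pays the key computation once per connection
instead of once per request.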


> 
> Does anyone have any suggestions, or maybe insight into why haproxy seems to 
> terminate SSL connections at a much lower rate per second than, for example, 
> nginx? Is there some functionality missing in haproxy that causes nginx to 
> win in terms of performance/scalability for terminations?
> 
> There’s many things I absolutely love about haproxy, but if there’s a 22-28x 
> difference in how many SSL terminations it can handle per second, then we’re 
> talking about a lot of added hardware to be able to handle, let’s say 500k 
> requests per second.
> 
I don't think we are comparing the same values here; there definitely isn't a
22-28x difference.


> The VM has AES-NI available as well.
> 

I think AES-NI only helps with ciphering the traffic, not with the key exchange.
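
You can see the difference on the same machine by comparing:

openssl speed -evp aes-128-gcm
openssl speed rsa2048

The first one benefits from AES-NI but only matters once the session is
established; the second one is what limits the number of new handshakes per
second.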

cheers,
Jérôme
