Hello,

First time posting to this list, so thanks in advance-- you guys make some 
good software!  I also posted this on the forums but figured this might 
get more coverage.

So, I'll jump right into it-- I've got two versions of OpenSSL 0.9.8* 
running on my Intel [Xeon/Westmere] system. One has been patched to 
leverage the new Intel AES-NI instructions, while the other has not. I've 
verified that both work as expected.

Running

"openssl speed --engine aesni -evp aes-128-cbc"

on the optimized executable/library shows significant performance boosts 
as compared to running it without specifying --engine aesni.

So, as a study, I want to see how OpenVPN would benefit from the same 
thing. I've got two versions-- one that utilizes the AES-NI version of 
OpenSSL, the other does not. (This is basically the same thing as 
specifying "--engine aesni" on the command line of OpenVPN as opposed to 
specifying no engine.) I have created an OpenVPN server/client connection 
between two computers on a subnet in a lab here at work, and I've been 
trying to drive traffic using iperf to show a performance increase with 
the AES-NI version-- but I simply cannot see any performance boost 
whatsoever.

I was able to run a simple loopback self-test using the "--test-crypto" 
command line option built in to OpenVPN along with the "time" command on 
Linux. Basically, I measured how quickly the two could generate 100000 
random packets, starting at 1 byte and going all the way up to 100000 
bytes, and loop them back through the encryption and decryption algorithms 
using aes-128-cbc. Granted, the "--test-crypto" command is doing a number 
of things besides running aes-128-cbc [such as generating random numbers 
to fill packets with, etc] but it indicates that we're getting somewhere.

> time openvpn --test-crypto --secret key-regular --verb 0 --tun-mtu 100000 --cipher aes-128-cbc

real 4m58.195s
user 4m57.363s
sys 0m0.000s

> time openvpn --test-crypto --secret key-aesni --verb 0 --tun-mtu 100000 --cipher aes-128-cbc --engine aesni

real 4m14.874s
user 4m14.160s
sys 0m0.000s

The optimized AES-NI version finishes about 43 seconds faster than the 
non-optimized version, or roughly 15% less wall-clock time.
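
For anyone who wants to check the arithmetic, here is a small Python sketch built from the two "time" results above. The throughput figure rests on an assumption: that --test-crypto loops exactly one packet of each size from 1 to 100000 bytes, as described earlier. It also includes all the non-AES work (random number generation, etc.), so treat it as a rough lower bound rather than a cipher benchmark.

```python
# Wall-clock ("real") timings copied from the two runs above, in seconds.
regular_s = 4 * 60 + 58.195   # no engine
aesni_s = 4 * 60 + 14.874     # --engine aesni

saved_s = regular_s - aesni_s
reduction_pct = 100.0 * saved_s / regular_s
speedup = regular_s / aesni_s

# Assumption: one packet per size, 1..100000 bytes, each looped back once.
total_bytes = sum(range(1, 100000 + 1))


def mbytes_per_s(seconds):
    """Implied loopback throughput in MB/s (10^6 bytes) for a run time."""
    return total_bytes / seconds / 1e6


print(f"saved {saved_s:.1f} s ({reduction_pct:.1f}% less time, {speedup:.2f}x)")
print(f"regular: {mbytes_per_s(regular_s):.1f} MB/s")
print(f"aesni:   {mbytes_per_s(aesni_s):.1f} MB/s")
```

If the one-packet-per-size assumption is wrong, only the MB/s lines change; the time saved and the percentage come straight from the measurements.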

Other than that, I have not been able to see any discernible difference 
with aes-128-cbc, aes-192-cbc, or aes-256-cbc between the engine being 
engaged and not engaged when driving traffic over a network between a 
client and server [both i7 boxes] with iperf. I've even tried running 
iperf with multiple threads (using the -P option) to no avail.

Does anybody know of a way or perhaps have a clever idea for me to stress 
test this system to show how AES-NI can really juice the performance? I 
know the RSA key exchange only happens at the beginning, and to my 
understanding [which is far from perfect] everything in between should 
really be contingent on one of two things:

a) network speed/infrastructure/latency
b) how quickly the server can encrypt the data to send it across, etc.
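
To make the two factors above concrete, here is a rough sketch in Python. The numbers are made up for illustration, not measurements, and the function name is hypothetical. It ignores latency and per-packet overhead and only shows that the slower of the two stages caps the tunnel.

```python
def tunnel_mbit(link_mbit, crypto_mbit):
    """Upper bound on tunnel throughput: the slower stage is the limit."""
    return min(link_mbit, crypto_mbit)


# Hypothetical rates in Mbit/s, chosen only to illustrate the idea.
crypto_plain = 1500.0   # assumed software AES rate
crypto_aesni = 4000.0   # assumed AES-NI rate

for link in (1000.0, 10000.0):
    plain = tunnel_mbit(link, crypto_plain)
    aesni = tunnel_mbit(link, crypto_aesni)
    print(f"link {link:.0f}: plain {plain:.0f}, aesni {aesni:.0f} Mbit/s")
```

With these made-up rates, the two builds look identical on the slower link because (a) is the limit, and only differ once the link is faster than the software cipher.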

From what I can gather, OpenVPN is single threaded, which already sounds 
like a bottleneck in and of itself-- so what about running multiple copies 
of the OpenVPN server and having a client/clients attached to each server, 
all generating iperf traffic at once? I'm just not sure if that would 
help. I'm a little stuck, and before I spend a ton more time trying 
different things, I thought it would be smart to consult the experts.

It would be really great to realize a situation that allows OpenVPN to 
significantly benefit from the built-in AES-NI booster instructions on this 
architecture. However, I'm afraid that there are other bottlenecks not 
allowing the AES-NI booster instructions to shine, such as kernel-user 
level interactions, network link speed, etc. Thus-- your help is greatly 
appreciated in helping me better understand what might be going on.

I tried muddling with the loopback feature a bit to remove any and all 
network variables, but that didn't get me anywhere. So far the 
"--test-crypto" has been the best indication that OpenVPN can indeed 
benefit from OpenSSL crypto accelerators.

Thank you so much--

Nick Lindberg
IBM Hardware Architecture Performance
414-258-1639
