Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working
Is it necessary to load aesni-intel module like modprobe aesni-intel ? Loading this module makes no difference in openssl speed output.

openssl speed -elapsed -evp aes-128-cbc

before loading aesni-intel module

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc    561737.40k   598685.65k   610372.15k   610802.35k   611521.88k

after loading aesni-intel module

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc    563835.61k   598692.84k   608343.64k   610805.42k   611524.61k

--
View this message in context: http://openssl.6102.n7.nabble.com/having-a-lot-of-troubles-trying-to-get-AES-NI-working-tp44285p47201.html
Sent from the OpenSSL - User mailing list archive at Nabble.com.
__
OpenSSL Project http://www.openssl.org
User Support Mailing List openssl-users@openssl.org
Automated List Manager majord...@openssl.org
Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working
Interesting...

Does this necessarily imply that the Rijndael cipher was selected as the AES winner because it was also simple enough to be fast, while still meeting the security and protection requirements set out when the open call for proposals was initiated?

I didn't realize that AES and modern processors were already so fast. I always thought that AES was going to be a fairly slow and compute-intensive process, and that's why I was trying to make sure that AES-NI was working. Turns out, I might not even need it. #FML

(And I'm pretty sure that some of you guys were probably telling me that, but I didn't realize it then. Now I see the light.)
Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working
Is there a mailing list that I can ask Rijndael-specific questions?
Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working
On a PC under Linux, you can do a cat /proc/cpuinfo and look for aes in the flags. On a PC under any OS, get the CPUID, and look for bit 25 of ECX. That's not OpenSSL-related.

The OPENSSL_ia32cap environment variable allows you to alter the CPUID result (only inside OpenSSL), and so alter its behaviour. It does not persist across a reboot; it only affects the processes it is set for.

Compare the following results:

* OPENSSL_ia32cap=~0x202 openssl speed -elapsed -evp aes-128-cbc
* openssl speed -elapsed -evp aes-128-cbc

--
Erwann ABALEA

On 15/03/2013 04:46, Ewen Chan wrote:

> Does it matter whether it's ia32 or ia64 even for an x64 processor?
>
> Shouldn't there be some way for me to check whether AES is enabled or being used (other than running a speed test) either in dmesg or /proc/ or with openssl itself?
>
> I'm a little confused, and surprised/shocked that there isn't a way to probe the status of whether the AES-NI is a) present and b) enabled/utilized.
>
> re: OPENSSL_ia32cap=~0x202
>
> so forgive me for asking lots of dumb questions but that would be
>
>   $ set OPENSSL_ia32cap=~0x202
>   $ export OPENSSL_ia32cap
>
> correct? And how do I re-enable it without having to reboot the system? What's the value that I should be putting in on the right-hand-side of the equal sign?
>
> Your help is much appreciated.
>
> Sincerely,
> Ewen
>
> On Thu, Mar 14, 2013 at 7:35 PM, Dr. Stephen Henson st...@openssl.org wrote:
>
>> On Thu, Mar 14, 2013, Ewen Chan wrote:
>>
>>> So this is a partial continuation from the discussion thread that I started yesterday in regards to using AES-CBC. I've got an Intel Core i7 3930K that supports AES-NI and I spent the greater part of last night trying to get openssl to work or at least recognize it, but it doesn't seem to want to do that.
>>
>> It is probably recognising it and you don't realise it. OpenSSL 1.0.1 automatically switches to AES-NI at the EVP level without going through an explicit AES-NI ENGINE.
>>
>> You can disable AES-NI detection with the environment variable:
>>
>>   OPENSSL_ia32cap=~0x202
>>
>> You should see a considerable speed up with openssl speed by comparing the two.
>>
>> Steve.
>> --
>> Dr Stephen N. Henson. OpenSSL project core developer.
>> Commercial tech support now available see: http://www.openssl.org
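The two checks described above can be sketched in a few lines of shell. This is only a rough illustration, not anything shipped with OpenSSL: the helper names has_aes_flag and with_ia32cap are made up for this example, and the mask value is simply the one quoted in this thread. The first helper looks for the aes flag in a cpuinfo-style file; the second runs a single command with OPENSSL_ia32cap set for that process only.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; neither is part of OpenSSL.

# has_aes_flag [FILE]
# Succeeds if the "flags" line of a cpuinfo-style file lists "aes".
# Defaults to /proc/cpuinfo, which exists on Linux only.
has_aes_flag() {
    grep -E '^flags[[:space:]]*:' "${1:-/proc/cpuinfo}" | grep -qw 'aes'
}

# with_ia32cap MASK COMMAND...
# Runs COMMAND with OPENSSL_ia32cap=MASK in that one process's environment.
# The calling shell's environment is left untouched.
with_ia32cap() {
    mask=$1
    shift
    env OPENSSL_ia32cap="$mask" "$@"
}

# Example usage (commented out so that sourcing this file runs nothing):
#   has_aes_flag && echo "CPU advertises AES-NI"
#   with_ia32cap '~0x202' openssl speed -elapsed -evp aes-128-cbc
#   openssl speed -elapsed -evp aes-128-cbc
```

Because the variable is scoped to one command here, there is nothing to re-enable afterwards. If it was exported in an interactive shell instead, unset OPENSSL_ia32cap removes it again without a reboot.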
Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working
a...okay. Gotcha. Thanks!
Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working
Do these numbers make sense or seem reasonable?

(I'm running a Core i7 3930K that's been OC'd to 4.5 GHz up from the stock 3.2 GHz, running cygwin 1.7.17 on Windows 7 x64 Professional, with 64 GB of DDR3-1600)

~$ OPENSSL_ia32cap=~0x202 openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 50880991 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 14919708 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 3870779 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 981188 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 122920 aes-256-cbc's in 3.00s
OpenSSL 1.0.1e 11 Feb 2013
built on: Tue Feb 12 15:42:44 CET 2013
options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: i686-pc-cygwin-gcc -D_WINDLL -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -DDSO_DLFCN -DHAVE_DLFCN_H -DTERMIOS -DL_ENDIAN -fomit-frame-pointer -O3 -march=i486 -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc    271365.29k   318287.10k   330306.47k   334912.17k   335653.55k

~$ openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 108926706 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 29319700 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 7428178 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 1863275 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 233116 aes-256-cbc's in 3.00s
OpenSSL 1.0.1e 11 Feb 2013
built on: Tue Feb 12 15:42:44 CET 2013
options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: i686-pc-cygwin-gcc -D_WINDLL -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -DDSO_DLFCN -DHAVE_DLFCN_H -DTERMIOS -DL_ENDIAN -fomit-frame-pointer -O3 -march=i486 -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc    580942.43k   625486.93k   633871.19k   635997.87k   636562.09k

Am I to interpret the results as meaning that even with AES-NI disabled, I'm still able to process at a minimum of 300 MB/s (for 64 B blocks), and that with AES-NI it's only double that (~600-and-change MB/s)?

And that for AES-128-CBC, with 8 kiB blocks, I'm able to process it at 465 MB/s without AES-NI, and 888 MB/s with?

That's MUCH faster than I expect it to be (even with AES-NI) and the 888 MB/s is faster than any available storage host-bus interface we've got right now; so I want to make sure that I am not losing my marbles here in trying to make some sense out of this data.

Thanks.
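As a sanity check on how to read those figures, the summary row can be recomputed from the raw counters: operations times block size, divided by elapsed seconds, in units of 1000 bytes per second. The sketch below does that arithmetic with awk. The helper name throughput_k is made up for this illustration and is not an openssl command; the inputs are the 8192-byte counters quoted in the output above.

```shell
#!/bin/sh
# throughput_k COUNT BLOCK_BYTES SECONDS
# Hypothetical helper: reproduces the "1000s of bytes per second" figure
# that openssl speed prints, from the raw operation count.
throughput_k() {
    awk -v n="$1" -v b="$2" -v s="$3" \
        'BEGIN { printf "%.2f\n", n * b / s / 1000 }'
}

# 8192-byte blocks, default capabilities (second run above)
throughput_k 233116 8192 3.00

# 8192-byte blocks, with OPENSSL_ia32cap=~0x202 (first run above)
throughput_k 122920 8192 3.00
```

The two results match the 636562.09k and 335653.55k entries in the tables, so the "k" suffix really does mean thousands of bytes per second, i.e. roughly 637 MB/s versus 336 MB/s on a single core.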
Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working
On Sat, Mar 16, 2013 at 12:06:07AM -0400, Ewen Chan wrote:
> That's MUCH faster than I expect it to be (even with AES-NI) and the 888 MB/s is faster than any available storage host-bus interface we've got right now; so I want to make sure that I am not losing my marbles here in trying to make some sense out of this data. Thanks.

Without AES-NI, other still quite fast instructions from SSE, SSE2, etc. are used. So it is a boost, but not the be-all and end-all boost.

In most cases the Core i7 will clobber almost any storage device, bar the very costliest SSDs. It's an extremely powerful CPU. If this isn't screaming fast enough, use GCM instead of CBC, and enjoy the multicore awesome thereof. ;)

As fast as it is, 10 GbE will still do roughly 1250 MB/sec (about 1192 MiB/sec). But it should fill up the 8 Gbps FC with no issues.

Matthew.
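For comparing cipher throughput against link speeds, a quick unit conversion helps. The sketch below assumes decimal gigabits (1 Gb = 10^9 bits) and ignores all encoding and protocol overhead, so the results are nominal upper bounds rather than achievable payload rates. The helper name line_rate is made up for this illustration.

```shell
#!/bin/sh
# line_rate GIGABITS_PER_SECOND
# Hypothetical helper: converts a nominal link speed into bytes per second,
# shown both in decimal megabytes (MB) and binary mebibytes (MiB).
# Assumes 1 Gb = 10^9 bits and no encoding or protocol overhead.
line_rate() {
    awk -v g="$1" 'BEGIN {
        bytes = g * 1e9 / 8
        printf "%.2f MB/s %.2f MiB/s\n", bytes / 1e6, bytes / 1048576
    }'
}

line_rate 10   # nominal 10 GbE
line_rate 8    # nominal 8 Gbps link
```

Set against the roughly 637 MB/s single-core AES-256-CBC figure reported earlier in the thread, one core would fall short of a nominal 10 Gb/s link under these assumptions.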
Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working
Okay then, here's another one of my infamous dumb questions. If that's the case, then why do we need the AES-NI instruction set?

If it's likely going to be storage and/or network bandwidth limited, wouldn't the improvements made by introducing the AES-NI instruction set be kind of wasted, in the sense that you can't really use it to its fullest potential anyway?

If the storage/network I/O is going to be your bottleneck, then regardless of whether you have AES-NI or not, you're likely going to get the same answer in terms of speed. (I actually copied the input file onto my SSD to try to eliminate that as the bottleneck, and then wrote the output to the same drive.)

(haha... it feels like the more questions I ask, the more confused I get... lol...)

Also, besides the fact that CBC can't be parallelized, is that why people haven't really spent much time or effort on trying to run AES encryption/decryption on GPGPUs? Because it's already faster than anything else is capable of at the moment?
Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working
On Sat, Mar 16, 2013 at 01:16:23AM -0400, Ewen Chan wrote:
> Okay then, here's another one of my infamous dumb questions. If that's the case, then why do we need the AES-NI instruction set?

It's far from the first accelerated instruction set of dubious utility. ;) Marketing... etc.

Actually, SSL/TLS performance is increased much more by an RSA accelerator. If I were Intel I would have made that first, before AES-NI, because RSA signs and verifies consume a lot more resources and are a lot more vulnerable to DoS than AES. But, of course, RSA is more complex. The tech companies are not trying to make the best possible product, but the best product that's economically feasible, which is a slightly different goal.

> If it's likely going to be storage and/or network bandwidth limited; wouldn't the improvements made by introducing and incorporating the AES-NI instruction set be kind of wasted in the sense that you can't really use it to the fullest potential anyways?

Amdahl's Law: the overall performance improvement you get from speeding up one area is limited by the fraction of the total time spent in that area.

> If the storage/network I/O is going to be your bottleneck/limiting factor, then regardless of whether you have AES-NI or not; you're likely going to get the same answer in terms of speed.

Yes!

> Also, is that why (besides the fact that CBC can't be parallelized) why it doesn't make sense or people really haven't spent too much time or effort into trying to run AES encryption/decryption on GPGPUs? Because it's already faster than anything else is capable of at the moment?

Like Erwann said, the memory transfers would likely cost more time than using AES-NI.
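The Amdahl's Law point can be made concrete with the usual formula, overall speedup = 1 / ((1 - p) + p / s), where p is the fraction of run time spent in the accelerated part and s is how much faster that part becomes. The sketch below is illustrative only: the helper name amdahl_speedup is made up, the fractions 0.10 and 0.90 are invented examples, and s = 2 is loosely based on the roughly doubled openssl speed figures reported in this thread.

```shell
#!/bin/sh
# amdahl_speedup FRACTION FACTOR
# Hypothetical helper: overall speedup when FRACTION of the run time
# (between 0 and 1) is made FACTOR times faster.
amdahl_speedup() {
    awk -v p="$1" -v s="$2" \
        'BEGIN { printf "%.2f\n", 1 / ((1 - p) + p / s) }'
}

# Encryption is 10% of the total time (I/O bound): doubling AES speed barely helps.
amdahl_speedup 0.10 2

# Encryption is 90% of the total time (CPU bound): doubling AES speed nearly doubles throughput.
amdahl_speedup 0.90 2
```

Under these made-up fractions the overall gain is about 1.05x in the I/O-bound case and about 1.82x in the CPU-bound case, which is the sense in which a fast cipher is "wasted" behind a slow disk or network.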