Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working

2013-11-06 Thread sarava.sars
Is it necessary to load the aesni-intel kernel module (e.g. with
modprobe aesni-intel)? Loading this module makes no difference to the
openssl speed output.

openssl speed -elapsed -evp aes-128-cbc

before loading aesni-intel module

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     561737.40k   598685.65k   610372.15k   610802.35k   611521.88k

after loading aesni-intel module

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     563835.61k   598692.84k   608343.64k   610805.42k   611524.61k
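
One way to read the two tables above is to compute the relative change per block size. The following is only a sketch in Python, using the figures copied from the two runs; the helper name relative_change is illustrative and not part of any tool mentioned here.

```python
# Throughput figures (in 1000s of bytes per second) copied from the two runs above.
before = [561737.40, 598685.65, 610372.15, 610802.35, 611521.88]
after = [563835.61, 598692.84, 608343.64, 610805.42, 611524.61]
block_sizes = [16, 64, 256, 1024, 8192]


def relative_change(old, new):
    """Return the change from old to new as a fraction of old."""
    return (new - old) / old


changes = [relative_change(b, a) for b, a in zip(before, after)]

for size, change in zip(block_sizes, changes):
    print(f"{size:>5} bytes: {change:+.2%}")
```

Every block size moves by less than half a percent, in both directions, which looks like run-to-run noise rather than an effect of loading the module.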




--
View this message in context: 
http://openssl.6102.n7.nabble.com/having-a-lot-of-troubles-trying-to-get-AES-NI-working-tp44285p47201.html
Sent from the OpenSSL - User mailing list archive at Nabble.com.
__
OpenSSL Project http://www.openssl.org
User Support Mailing List    openssl-users@openssl.org
Automated List Manager   majord...@openssl.org


Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working

2013-03-16 Thread Ewen Chan
Interesting...

Does this imply that the Rijndael cipher was selected as the AES winner
because it was also simple enough to be fast, while meeting the security
and protection requirements set out in the open call for proposals?

I didn't realize that AES and modern processors were already so fast.
I always thought that AES was going to be a fairly slow and
compute-intensive process, and that's why I was trying to make sure
that AES-NI was working. Turns out, I might not even need it. #FML

(And I'm pretty sure that some of you guys were probably telling me
that, but I didn't realize it then. Now I see the light.)



Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working

2013-03-16 Thread Ewen Chan
Is there a mailing list where I can ask Rijndael-specific questions?



Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working

2013-03-15 Thread Erwann Abalea
On a PC under Linux, you can do a cat /proc/cpuinfo and look for aes
in the flags.

On a PC under any OS, get the CPUID (leaf 1) and look for bit 25 of ECX.
That's not OpenSSL-related.


The OPENSSL_ia32cap environment variable lets you alter the CPUID result
(only inside OpenSSL) and thereby alter its behaviour. It does not persist
across a reboot; it only affects processes that have it set in their
environment.


Compare the following results:

 * OPENSSL_ia32cap=~0x200000200000000 openssl speed -elapsed -evp aes-128-cbc
 * openssl speed -elapsed -evp aes-128-cbc
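
The two checks described above (the aes flag in /proc/cpuinfo and bit 25 of ECX) can be sketched in Python as follows. The function names are made up for illustration. The mask helper relies on an assumption taken from OpenSSL's OPENSSL_ia32cap documentation rather than from this thread: the capability vector keeps the CPUID leaf 1 ECX word in its upper 32 bits, so ECX bit n becomes bit 32+n. Reading the raw CPUID value itself is platform-specific and is left out.

```python
AESNI_ECX_BIT = 25      # CPUID leaf 1, ECX bit 25 signals AES-NI
PCLMULQDQ_ECX_BIT = 1   # CPUID leaf 1, ECX bit 1 signals PCLMULQDQ


def cpuinfo_has_aes(cpuinfo_text):
    """Return True if any 'flags' line of /proc/cpuinfo text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags" and "aes" in value.split():
            return True
    return False


def ecx_has_aesni(ecx):
    """Return True if bit 25 is set in a CPUID leaf 1 ECX value."""
    return bool((ecx >> AESNI_ECX_BIT) & 1)


def ia32cap_mask(*ecx_bits):
    """Build a capability mask, assuming ECX occupies bits 32..63 of the vector."""
    mask = 0
    for bit in ecx_bits:
        mask |= 1 << (32 + bit)
    return mask


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("aes flag present:", cpuinfo_has_aes(f.read()))
    except OSError:
        print("/proc/cpuinfo not available on this system")
    # Mask with the AES-NI and PCLMULQDQ bits set; prefixing it with ~ in
    # OPENSSL_ia32cap asks OpenSSL to treat those features as absent.
    print(hex(ia32cap_mask(AESNI_ECX_BIT, PCLMULQDQ_ECX_BIT)))
```

Note that the flag only says the CPU offers the instructions; whether OpenSSL actually uses them is what the speed comparison shows.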


--
Erwann ABALEA

On 15/03/2013 04:46, Ewen Chan wrote:

Does it matter whether it's ia32 or ia64 even for an x64 processor?

Shouldn't there be some way for me to check whether AES is enabled or
being used (other than running a speed test) either in dmesg or /proc/
or with openssl itself? I'm a little confused, and surprised/shocked
that there isn't a way to probe the status of whether the AES-NI is a)
present and b) enabled/utilized.

re: OPENSSL_ia32cap=~0x200000200000000
so forgive me for asking lots of dumb questions but that would be
$ set OPENSSL_ia32cap=~0x200000200000000
$ export OPENSSL_ia32cap

correct?

And how do I re-enable it without having to reboot the system? What's
the value that I should be putting in on the right-hand-side of the
equal sign?

Your help is much appreciated.

Sincerely,
Ewen

On Thu, Mar 14, 2013 at 7:35 PM, Dr. Stephen Henson st...@openssl.org wrote:

On Thu, Mar 14, 2013, Ewen Chan wrote:


So this is a partial continuation from the discussion thread that I
started yesterday in regards to using AES-CBC.

I've got an Intel Core i7 3930K that supports AES-NI and I spent the
greater part of last night trying to get openssl to work or at least
recognize it, but it doesn't seem to want to do that.


It is probably recognising it and you don't realise it. OpenSSL 1.0.1
automatically switches to AES-NI at the EVP level without going through an
explicit AES-NI ENGINE.

You can disable AES-NI detection with the environment variable:

OPENSSL_ia32cap=~0x200000200000000

You should see a considerable speed up with openssl speed by comparing the
two.

Steve.
--
Dr Stephen N. Henson. OpenSSL project core developer.
Commercial tech support now available see: http://www.openssl.org





Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working

2013-03-15 Thread Ewen Chan
a...okay. Gotcha.

Thanks!



Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working

2013-03-15 Thread Ewen Chan
Do these numbers make sense or seem reasonable?

(I'm running a Core i7 3930K that's been OC'd to 4.5 GHz up from the
stock 3.2 GHz, running cygwin 1.7.17 on Windows 7 x64 Professional,
with 64 GB of DDR3-1600)

~$ OPENSSL_ia32cap=~0x200000200000000 openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 50880991 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 14919708 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 3870779 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 981188 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 122920 aes-256-cbc's in 3.00s
OpenSSL 1.0.1e 11 Feb 2013
built on: Tue Feb 12 15:42:44 CET 2013
options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: i686-pc-cygwin-gcc -D_WINDLL -DOPENSSL_PIC -DZLIB
-DOPENSSL_THREADS  -DDSO_DLFCN -DHAVE_DLFCN_H -DTERMIOS -DL_ENDIAN
-fomit-frame-pointer -O3 -march=i486 -Wall -DOPENSSL_BN_ASM_PART_WORDS
-DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m
-DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM
-DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc     271365.29k   318287.10k   330306.47k   334912.17k   335653.55k

~$ openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 108926706 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 29319700 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 7428178 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 1863275 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 233116 aes-256-cbc's in 3.00s
OpenSSL 1.0.1e 11 Feb 2013
built on: Tue Feb 12 15:42:44 CET 2013
options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: i686-pc-cygwin-gcc -D_WINDLL -DOPENSSL_PIC -DZLIB
-DOPENSSL_THREADS  -DDSO_DLFCN -DHAVE_DLFCN_H -DTERMIOS -DL_ENDIAN
-fomit-frame-pointer -O3 -march=i486 -Wall -DOPENSSL_BN_ASM_PART_WORDS
-DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m
-DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM
-DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc     580942.43k   625486.93k   633871.19k   635997.87k   636562.09k

Am I to interpret the results as saying that even with AES-NI disabled,
I'm still able to process at least 300 MB/s (for block sizes of 64 B and
up), and that with AES-NI it's only double that (~600-and-change MB/s)?

And that for AES-128-CBC with 8 KiB blocks, I'm able to process
465 MB/s without AES-NI, and 888 MB/s with it?

That's MUCH faster than I expected it to be (even with AES-NI) and the
888 MB/s is faster than any available storage host-bus interface we've
got right now; so I want to make sure that I am not losing my marbles
here in trying to make some sense out of this data.
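
As a sanity check on the reading above, the reported figures can be recomputed from the raw operation counts. This is a small Python sketch using the 8192-byte rows of the two aes-256-cbc runs; the function name throughput_k is only illustrative.

```python
def throughput_k(operations, block_bytes, seconds):
    """Return throughput in 1000s of bytes per second, as openssl speed reports it."""
    return operations * block_bytes / seconds / 1000.0


# 8192-byte rows of the two aes-256-cbc runs above.
without_aesni = throughput_k(122920, 8192, 3.00)
with_aesni = throughput_k(233116, 8192, 3.00)

print(f"without AES-NI: {without_aesni / 1000:.1f} MB/s")
print(f"with AES-NI:    {with_aesni / 1000:.1f} MB/s")
print(f"speedup:        {with_aesni / without_aesni:.2f}x")
```

The recomputed values match the table rows (335653.55k and 636562.09k), so the "k" column really is thousands of bytes per second, and the AES-NI gain for this build is a little under 2x.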

Thanks.


Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working

2013-03-15 Thread Matthew Hall
On Sat, Mar 16, 2013 at 12:06:07AM -0400, Ewen Chan wrote:
 That's MUCH faster than I expect it to be (even with AES-NI) and the
 888 MB/s is faster than any available storage host-bus interface we've
 got right now; so I want to make sure that I am not losing my marbles
 here in trying to make some sense out of this data.
 
 Thanks.

Without AES-NI, other, still quite fast instructions from SSE, SSE2, etc. are
used. So AES-NI is a boost, but not the be-all and end-all boost.

In most cases the Core i7 will clobber almost any storage device bar the very
costliest SSDs. It's an extremely powerful CPU. If this isn't screaming fast
enough, use GCM instead of CBC, and enjoy the multicore awesome thereof. ;)

As fast as it is, 10 GbE will still do about 1250 MB/s (roughly 1192 MiB/s).
But it should fill up the 8 Gbps FC with no issues.
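
The link-rate comparison is plain unit conversion, sketched below in Python. It assumes the nominal 10 Gbit/s line rate with decimal prefixes and ignores framing and protocol overhead; the 888 MB/s figure is the AES-128-CBC number quoted earlier in the thread, and the function names are illustrative. Fibre Channel is left out because its usable rate depends on line encoding, which a raw conversion does not capture.

```python
def link_rate_mb_per_s(gigabits_per_second):
    """Convert a raw line rate in Gbit/s to MB/s (decimal megabytes)."""
    return gigabits_per_second * 1e9 / 8 / 1e6


def link_utilisation(cipher_mb_per_s, gigabits_per_second):
    """Fraction of the raw link rate that the cipher throughput would fill."""
    return cipher_mb_per_s / link_rate_mb_per_s(gigabits_per_second)


ten_gbe = link_rate_mb_per_s(10)
print(f"10 GbE raw rate: {ten_gbe:.0f} MB/s ({ten_gbe * 1e6 / 2**20:.0f} MiB/s)")
print(f"888 MB/s fills {link_utilisation(888, 10):.0%} of it")
```

So a single core doing CBC at that rate would still leave a 10 GbE link partly idle.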

Matthew.


Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working

2013-03-15 Thread Ewen Chan
Okay then, here's another one of my infamous dumb questions.

If that's the case, then why do we need the AES-NI instruction set?

If it's likely going to be storage and/or network bandwidth limited;
wouldn't the improvements made by introducing and incorporating the
AES-NI instruction set be kind of wasted in the sense that you can't
really use it to the fullest potential anyways?

If the storage/network I/O is going to be your bottleneck/limiting
factor, then regardless of whether you have AES-NI or not; you're
likely going to get the same answer in terms of speed.

(I actually copied the input file onto my SSD so as to try and
eliminate that as the bottleneck; and then write the output to the
same drive.)

(haha...it feels like the more questions I ask, the more confused
I get...lol...)

Also, besides the fact that CBC encryption can't be parallelized, is that
why people haven't spent much time or effort on running AES
encryption/decryption on GPGPUs? Because it's already faster than
anything else is capable of at the moment?



Re: [openssl-users] Re: having a lot of troubles trying to get AES-NI working

2013-03-15 Thread Matthew Hall
On Sat, Mar 16, 2013 at 01:16:23AM -0400, Ewen Chan wrote:
 Okay then, here's another one of my infamous dumb questions.
 
 If that's the case, then why do we need the AES-NI instruction set?

It's far from the first accelerated instruction set of dubious utility. ;)

Marketing... etc.

Actually, SSL / TLS performance would benefit far more from an RSA
accelerator. If I were Intel I would have made that first, before AES-NI,
because RSA signs and verifies consume a lot more resources and are a lot more
vulnerable to DoS than AES. But, of course, RSA is more complex.

The tech companies are not trying to make the best possible product, but the 
best product that's economically feasible, which is a slightly different goal.

 If it's likely going to be storage and/or network bandwidth limited;
 wouldn't the improvements made by introducing and incorporating the
 AES-NI instruction set be kind of wasted in the sense that you can't
 really use it to the fullest potential anyways?

Amdahl's Law: the overall speedup you get from improving one area is limited
by the fraction of the total run time spent in that area.
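
To put numbers on that, here is a minimal Python sketch of the usual Amdahl formula, speedup = 1 / ((1 - p) + p / s), where p is the fraction of run time spent in the improved part and s is how much faster that part becomes. The workload fractions below are hypothetical; the factor of 2 is roughly the AES-NI gain measured earlier in this thread.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the run time gets `factor` times faster."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Hypothetical workloads: AES takes 10%, 50% or 90% of the time,
# and AES-NI makes the AES part twice as fast.
for fraction in (0.1, 0.5, 0.9):
    print(f"AES share {fraction:.0%}: overall {amdahl_speedup(fraction, 2.0):.2f}x")
```

If encryption is only a tenth of the total time, with disk or network I/O taking the rest, doubling AES speed buys about 5% overall.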

 If the storage/network I/O is going to be your bottleneck/limiting
 factor, then regardless of whether you have AES-NI or not; you're
 likely going to get the same answer in terms of speed.

Yes!

 Also, is that why (besides the fact that CBC can't be parallelized)
 why it doesn't make sense or people really haven't spent too much time
 or effort into trying to run AES encryption/decryption on GPGPUs?
 Because it's already faster than anything else is capable of at the
 moment?

Like Erwann said, the memory transfers would likely cost more time than using 
AES-NI.