With SSE2 disabled:

> openssl speed sha-512:
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha-512           1050.62k     4223.53k     6141.97k     8488.01k     9480.48k

With SSE2 enabled:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha-512           3171.75k    12757.93k    22761.88k    34514.56k    40059.42k

I ran the test several times and got a consistently similar speedup:
more than 4x on large blocks (about 4.2x at 8192 bytes). Many thanks
to Andy for the code.
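
For anyone who wants to check the ratios, here is a minimal sketch that
divides the SSE2 figures by the non-SSE2 figures from the two runs
above. The function and array names are made up for illustration; only
the numbers come from the output.

```c
#include <stddef.h>
#include <stdio.h>

/* Throughput figures (1000s of bytes per second) copied from the two
 * "openssl speed" runs above. */
const int    block_bytes[] = { 16, 64, 256, 1024, 8192 };
const double sse2_off[] = { 1050.62,  4223.53,  6141.97,  8488.01,  9480.48 };
const double sse2_on[]  = { 3171.75, 12757.93, 22761.88, 34514.56, 40059.42 };

#define N_SIZES (sizeof block_bytes / sizeof block_bytes[0])

/* Speedup factor for column i: SSE2 throughput over non-SSE2 throughput. */
double speedup(size_t i)
{
    return sse2_on[i] / sse2_off[i];
}

/* Print one line per block size with its speedup factor. */
void print_speedups(void)
{
    size_t i;

    for (i = 0; i < N_SIZES; i++)
        printf("%5d bytes: %.2fx\n", block_bytes[i], speedup(i));
}
```

Calling print_speedups() from main() shows a factor of about 3.0 for
16- and 64-byte blocks, rising to about 4.2 at 8192 bytes.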

BTW, the method of enabling SSE2 via OPENSSL_ia32cap is IMHO
a kludge. What is 0x04000000 in decimal anyway?
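
To answer the question: 0x04000000 is 1 shifted left by 26, which
matches the SSE2 flag (bit 26 of EDX from CPUID leaf 1). A small sketch
that builds the mask from the bit number instead of a magic constant;
the helper names are hypothetical and not part of OpenSSL.

```c
#include <assert.h>
#include <stdio.h>

/* Bit 26 of the EDX feature flags returned by CPUID leaf 1 is SSE2. */
#define IA32CAP_SSE2_BIT 26

/* Build a capability mask with a single feature bit set. */
unsigned long ia32cap_mask(unsigned int bit)
{
    assert(bit < 32);
    return 1UL << bit;
}

/* Print a mask in the hex form used above and in decimal. */
void print_mask(unsigned long mask)
{
    printf("0x%08lx = %lu\n", mask, mask);
}
```

print_mask(ia32cap_mask(IA32CAP_SSE2_BIT)) prints
"0x04000000 = 67108864".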

On djgpp, where I tested this, we are free to use whatever CPU
instructions are supported. The only trouble is getting at the CR4
register. djgpp also has a SIGILL handler, so it could fall back to the
non-SSE2 method. I have some CPU detection code that could set
OPENSSL_ia32cap programmatically if that's desired.
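
A rough sketch of what such detection could look like (this is not the
detection code mentioned above). It assumes a GCC-compatible compiler
that ships <cpuid.h>, a libc with setenv(), and that the environment
variable is set before the library reads it. The helper names are
hypothetical. Note that CPUID only says the CPU has SSE2; it does not
tell you whether the OS has enabled it in CR4.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() */
#include <stdio.h>
#include <stdlib.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define HAVE_CPUID_H 1
#endif

#define SSE2_EDX_BIT 26  /* SSE2 flag in EDX of CPUID leaf 1 */

/* Return 1 if CPUID reports SSE2, 0 otherwise (always 0 on non-x86). */
int cpu_reports_sse2(void)
{
#ifdef HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;  /* CPUID leaf 1 not available */
    return (int)((edx >> SSE2_EDX_BIT) & 1u);
#else
    return 0;
#endif
}

/* Hypothetical helper: export OPENSSL_ia32cap=0x04000000 when the CPU
 * reports SSE2. Returns 1 if the variable was set, 0 if SSE2 was not
 * reported, -1 if setenv() failed. */
int enable_sse2_if_present(void)
{
    char value[16];

    if (!cpu_reports_sse2())
        return 0;
    snprintf(value, sizeof value, "0x%08lx", 1UL << SSE2_EDX_BIT);
    return setenv("OPENSSL_ia32cap", value, 1) == 0 ? 1 : -1;
}
```

A caller would run enable_sse2_if_present() once at start-up, before
any OpenSSL call, and rely on the SIGILL fallback if the OS turns out
not to support the instructions.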

Tests done with djgpp under a Win-XP DOS-box on a 2.1GHz 
Pentium 4 CPU.

--gv


______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]
