With SSE2 disabled:

> openssl speed sha-512
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha-512          1050.62k     4223.53k     6141.97k     8488.01k     9480.48k

With SSE2 enabled:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha-512          3171.75k    12757.93k    22761.88k    34514.56k    40059.42k

I ran the test several times and saw a similar, consistent speed increase
each time: more than 4x on large blocks (about 4.2x at 8192 bytes).
Many thanks to Andy for the code.

BTW, the method of enabling SSE2 via OPENSSL_ia32cap is IMHO a kludge.
What is 0x04000000 in decimal anyway? On djgpp, where I tested this, we
are free to use whatever instructions the CPU supports. The only trouble
is getting at the CR4 register. djgpp also has a SIGILL handler, so it
could fall back to the non-SSE2 method. I have some CPU detection code
that could set OPENSSL_ia32cap programmatically if that's desired.

Tests were done with djgpp in a Win-XP DOS box on a 2.1GHz Pentium 4 CPU.

--gv
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]
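
For anyone who wants to check the arithmetic in this post, here is a
minimal Python sketch (not part of the original test; the names
BLOCK_SIZES, speedups, set_bits and SSE2_MASK are made up for
illustration). It computes the per-block-size speedup from the two
result rows above and answers the "what is 0x04000000 in decimal"
question. The single bit it reports is the SSE2 feature flag position
in the EDX output of CPUID leaf 1, which, as far as I know, is what the
low 32 bits of OPENSSL_ia32cap mirror.

```python
# Throughput figures copied from the two `openssl speed sha-512` runs
# in this post (openssl speed reports 1000s of bytes per second).
BLOCK_SIZES = [16, 64, 256, 1024, 8192]
SSE2_OFF = [1050.62, 4223.53, 6141.97, 8488.01, 9480.48]
SSE2_ON = [3171.75, 12757.93, 22761.88, 34514.56, 40059.42]

# The mask quoted in the post for enabling SSE2 via OPENSSL_ia32cap.
SSE2_MASK = 0x04000000


def speedups(off, on):
    """Return the on/off throughput ratio for each block size."""
    return [b / a for a, b in zip(off, on)]


def set_bits(mask):
    """Return the positions of the bits set in mask, lowest first."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


if __name__ == "__main__":
    for size, ratio in zip(BLOCK_SIZES, speedups(SSE2_OFF, SSE2_ON)):
        print(f"{size:>5} bytes: {ratio:.2f}x")
    print(f"0x{SSE2_MASK:08x} = {SSE2_MASK} decimal, "
          f"bits set: {set_bits(SSE2_MASK)}")
```

The ratios grow with block size because the per-call overhead matters
less on large inputs, so the 8192-byte column is the fairest measure of
the raw SSE2 gain.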
