This is a poll for votes.
It was noted that the Intel IA-32 compiler (the linux-ia32-icc target, at
least) generates *noticeably* faster code for SHA1, 30% faster to be
specific, than the hand-coded assembler implementation, at least on the P4
platform. I have re-tuned the SHA1 assembler implementation, which now
performs as follows:
            compared with current    compared with icc
            assembler impl.          generated code
Pentium     -25%                     +37%
PIII/AMD    +8%                      +16%
P4          +85%(!)                  +45%
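As a rough cross-check of the table above, the two columns can be combined to see how icc-generated code compares with the current assembler on each CPU. This is only a sketch: it assumes each percentage is a throughput change relative to its baseline, and the name icc_vs_old_asm is made up for illustration.

```python
# Figures copied from the table above: (vs. current assembler, vs. icc code), in percent.
# Assumption: each figure is a throughput change relative to its baseline.
RESULTS = {
    "Pentium": (-25, 37),
    "PIII/AMD": (8, 16),
    "P4": (85, 45),
}


def icc_vs_old_asm(vs_old_asm_pct, vs_icc_pct):
    """Implied throughput change of icc code relative to the current assembler, in percent."""
    new_over_old = 1 + vs_old_asm_pct / 100  # re-tuned speed / current assembler speed
    new_over_icc = 1 + vs_icc_pct / 100      # re-tuned speed / icc code speed
    return (new_over_old / new_over_icc - 1) * 100


if __name__ == "__main__":
    for cpu, (vs_old, vs_icc) in RESULTS.items():
        print(f"{cpu:9s} icc vs current assembler: {icc_vs_old_asm(vs_old, vs_icc):+.0f}%")
```

Under that assumption the P4 row implies that icc code is about 28% faster than the current assembler, in line with the roughly 30% quoted above, while on Pentium the icc code would be about 45% slower than the current assembler.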
Options for integrating the re-tuned code are:
1. replace crypto/sha/asm/sha1-586.pl and let the couple of Pentium users
suffer a 25% performance loss;
2. add crypto/sha/asm/sha1-686.pl and make it the default, so that the
couple of Pentium users *can* pull the old code if they need the 25% back;
3. add crypto/sha/asm/sha1-686.pl and have ./config choose between the two
versions, depending on which computer ./config is executed on.
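To make option 3 concrete, the selection it describes could look roughly like the sketch below. It is written in Python only for illustration: the real ./config is a shell script, and the function names and the use of the Linux /proc/cpuinfo "cpu family" field are assumptions, not existing OpenSSL logic. Family 5 is the Pentium class; PIII reports family 6 and P4 reports family 15.

```python
def cpu_family_from_cpuinfo(text):
    """Return the first 'cpu family' value found in /proc/cpuinfo-style text, or None."""
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "cpu family":
            return int(value.strip())
    return None


def pick_sha1_asm(cpu_family):
    """Pick the SHA1 assembler script for an x86 CPU family (hypothetical helper)."""
    # Pentium-class (family 5) and older keep the old code, which is 25% faster there.
    if cpu_family is not None and cpu_family <= 5:
        return "crypto/sha/asm/sha1-586.pl"
    # Everything else, including an unknown family, gets the re-tuned code.
    return "crypto/sha/asm/sha1-686.pl"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            family = cpu_family_from_cpuinfo(f.read())
    except OSError:
        family = None  # not Linux, or /proc is unavailable
    print(pick_sha1_asm(family))
```

Falling back to sha1-686.pl when the family cannot be determined mirrors the default in option 2. Note that the result describes the build machine, which is not necessarily the machine the library will run on.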
My personal vote is #1. If nobody speaks up within 3-4 days, I'll
replace sha1-586.pl with the re-tuned version.
A.
______________________________________________________________________
OpenSSL Project http://www.openssl.org
Development Mailing List [EMAIL PROTECTED]
Automated List Manager [EMAIL PROTECTED]