This is a poll for votes.

It was noted that [at least] the Intel IA-32 compiler (linux-ia32-icc
target) generates *noticeably* faster SHA1 code, 30% to be specific,
than the hand-coded assembler implementation, on at least the P4
platform. I have re-tuned the SHA1 assembler implementation, which now
performs as follows:

                compared with current   compared with icc
                assembler impl.         generated code
Pentium         -25%                    +37%
PIII/AMD        +8%                     +16%
P4              +85%(!)                 +45%

Options for integrating the re-tuned code are:

1. replace crypto/sha/asm/sha1-586.pl and let a couple of Pentium users
suffer a 25% performance loss;
2. add crypto/sha/asm/sha1-686.pl and make it the default, so that the
couple of Pentium users *can* pull the old code if they need the 25% back;
3. add crypto/sha/asm/sha1-686.pl and have ./config choose between the
two versions, depending on the computer ./config is executed on.
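
To make option 3 concrete, here is a minimal sketch of the kind of
selection ./config would have to make. It is not how ./config works
today: it is written in Python for illustration only, it assumes a Linux
host where /proc/cpuinfo reports a "cpu family" field, and the function
names are hypothetical. It treats family 5 (Pentium-class) as the only
case that keeps the old code; anything else, including an unreadable
/proc/cpuinfo, falls through to the re-tuned module.

```python
import re

# Module paths as named in the options above.
OLD_MODULE = "crypto/sha/asm/sha1-586.pl"
NEW_MODULE = "crypto/sha/asm/sha1-686.pl"


def cpu_family(cpuinfo_text):
    """Return the first 'cpu family' value found in /proc/cpuinfo text, or None."""
    match = re.search(r"^cpu family\s*:\s*(\d+)", cpuinfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def pick_sha1_module(cpuinfo_text):
    """Assumption: only family 5 (Pentium-class) keeps the old code."""
    family = cpu_family(cpuinfo_text)
    return OLD_MODULE if family == 5 else NEW_MODULE


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or unreadable: use the default module
    print(pick_sha1_module(text))
```

Note that a choice made this way reflects the build machine, not
necessarily the machine the resulting library will run on.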

My personal vote is #1. If nobody speaks up within 3-4 days, I'll
replace sha1-586.pl with the re-tuned version. A.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]
