But given the amplitude of the performance swing (2.5x one way or the other), we would most likely have to detect P4 at run-time and fall back to an alternative code path... In *both* the 32- and 64-bit cases...

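For illustration only, the run-time detection mentioned above could look roughly like the sketch below. It is not the code that went into the assembler modules. It assumes the caller has already obtained the CPUID vendor string and the leaf 1 EAX signature, and it treats "Intel, family 15, extended family 0" as meaning P4. The names rc4_path, is_intel_p4 and select_rc4_path are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical labels for the two code paths discussed in this thread. */
enum rc4_path { RC4_PATH_DEFAULT, RC4_PATH_P4_CHAR };

/*
 * vendor: NUL-terminated CPUID vendor string (leaf 0, EBX:EDX:ECX).
 * sig:    processor signature (leaf 1, EAX).
 * Assumption: P4 cores report vendor "GenuineIntel", base family 0xF
 * and extended family 0. The vendor check matters because other
 * vendors also use base family 0xF.
 */
static int is_intel_p4(const char *vendor, uint32_t sig)
{
    uint32_t base_family = (sig >> 8) & 0xF;   /* bits 11..8  */
    uint32_t ext_family  = (sig >> 20) & 0xFF; /* bits 27..20 */

    return strcmp(vendor, "GenuineIntel") == 0
        && base_family == 0xF
        && ext_family == 0;
}

/* Pick the alternative path on P4 only, the default path everywhere else. */
static enum rc4_path select_rc4_path(const char *vendor, uint32_t sig)
{
    return is_intel_p4(vendor, sig) ? RC4_PATH_P4_CHAR : RC4_PATH_DEFAULT;
}
```

The selection would be done once at start-up and the result cached, so the per-call cost is a single branch or an indirect call.
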
I failed to compose blended code that would perform satisfactorily on both P4 and non-P4 cores, so I implemented an alternative RC4_CHAR assembler code path which is engaged only when executed on P4. The code was benchmarked on 32-bit P4 only, where the performance improvement was measured at 2.8x :-) It should be noted that the CVS versions of the assembler modules alone won't give any performance improvement on P4 if linked into 0.9.7. One has to benchmark 0.9.8. If you prefer a source tar-ball, then you have to wait until at least openssl-SNAP-20041122.tar.gz becomes available.

A.
______________________________________________________________________
OpenSSL Project http://www.openssl.org
Development Mailing List [EMAIL PROTECTED]
Automated List Manager [EMAIL PROTECTED]
