Hi,
The measurement I sent yesterday for OpenSSL (with inlined T4
instruction support) was not quite accurate.
Some of the T4-specific code you committed was not enabled when we
tested: I realized that __sparc__ is not defined on our system.
So I changed "#if defined(__sparc__)" to "#if defined(__sparc)".
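For reference, a guard that accepts either spelling could look like the sketch below. This is only an illustration, not the change that went into OpenSSL: the macro name BUILD_TARGETS_SPARC and both helper functions are hypothetical. It assumes, as far as I know, that the Solaris Studio compiler predefines __sparc while GCC predefines __sparc__.

```c
/*
 * Hypothetical sketch: detect a SPARC target regardless of which
 * spelling the compiler predefines. BUILD_TARGETS_SPARC is an
 * illustrative name, not an OpenSSL macro.
 */
#if defined(__sparc) || defined(__sparc__)
# define BUILD_TARGETS_SPARC 1
#else
# define BUILD_TARGETS_SPARC 0
#endif

/* Returns 1 if this file was compiled for a SPARC target, 0 otherwise. */
int build_targets_sparc(void)
{
    return BUILD_TARGETS_SPARC;
}

/* Returns a short label for the detected target, for diagnostics. */
const char *build_target_name(void)
{
    return build_targets_sparc() ? "sparc" : "other";
}
```

Testing both macros keeps the T4 code paths enabled with either compiler, instead of silently dropping them when one spelling is missing.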
Now we are seeing better numbers with OpenSSL:
                   sign    verify    sign/s verify/s
rsa 1024 bits 0.000351s 0.000024s   2852.9  42311.0
rsa 2048 bits 0.001258s 0.000047s    795.1  21128.6
rsa 4096 bits 0.006240s 0.000395s    160.3   2533.3
That is virtually identical to the Linux results, so one mystery is
solved. I'll commit the fix at some later point.
That is still slower than our t4 engine for 1024- and 2048-bit RSA sign:
                   sign    verify    sign/s verify/s
rsa 1024 bits 0.000237s 0.000028s   4221.9  36119.8
rsa 2048 bits 0.000876s 0.000075s   1141.7  13285.6
rsa 4096 bits 0.006341s 0.002139s    157.7    467.5
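To put the gap in numbers, the sign/s columns of the two tables can be compared directly. A minimal sketch in C, where both function names are made up for illustration and the inputs are simply the figures quoted above:

```c
/*
 * Hypothetical helpers for comparing two "openssl speed"-style results.
 * Neither function is part of OpenSSL.
 */

/* Convert seconds per operation into operations per second. */
double ops_per_second(double seconds_per_op)
{
    return 1.0 / seconds_per_op;
}

/* Ratio of candidate throughput to baseline throughput (both in ops/s).
 * A value above 1.0 means the candidate is faster. */
double speedup(double baseline_ops, double candidate_ops)
{
    return candidate_ops / baseline_ops;
}
```

For example, speedup(2852.9, 4221.9) compares the two 1024-bit sign rates and speedup(795.1, 1141.7) the two 2048-bit ones; both come out well above 1.0 in favour of the t4 engine, while the 4096-bit sign rates (160.3 vs 157.7) are nearly equal.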
As mentioned, the problem seems to be "multi-layer", and we are moving in
the right direction.
So, I enabled "warm-up" as you suggested, but the performance numbers
still look the same.
Well, the suggestion was of a "what-if" character, a product of slight
desperation :-) But it appears to be unnecessary, so we leave it as it is.
I noticed some 64-bit SPARCv9-specific code in sparct4-mont.pl, but my
64-bit library doesn't contain those instructions.
It looks like the __arch64__ branch was taken. Did you expect the
SPARCV9_64BIT_STACK section to be compiled in?
No. SPARCV9_64BIT_STACK is a Linux-specific thing. In the commentary
section of crypto/bn/asm/sparct4-mont.pl you will see a paragraph that
starts with "32-bit code is prone to performance degradation." This is
what SPARCV9_64BIT_STACK is about.
______________________________________________________________________
OpenSSL Project http://www.openssl.org
Development Mailing List [email protected]
Automated List Manager [email protected]