Marton,
I think your card is simply slow. I've done a similar test (RSA only)
using an IBM 2058 eServer Cryptographic Accelerator (ICA), which has 5 UltraCypher crypto processors on it.
The machine is a dual Xeon 2.4 box running Linux 2.4.20. I used OpenSSL 0.9.7b with IBM's ibmca engine and libica, with threading activated. Both CPUs were at 100% with the hardware engine deactivated, and main CPU usage was minimal with the engine activated.
For 2048 bits the ICA could do almost 70 times as many signing operations as the two main CPUs could handle.
No, it's only a factor of 1.5; see below.
/opt/src/openssl-0.9.7b/apps # ./openssl speed rsa
Doing 512 bit private rsa's for 10s: 11089 512 bit private RSA's in 9.99s
Doing 512 bit public rsa's for 10s: 120057 512 bit public RSA's in 10.00s
Doing 1024 bit private rsa's for 10s: 2124 1024 bit private RSA's in 10.00s
Doing 1024 bit public rsa's for 10s: 40108 1024 bit public RSA's in 10.00s
Doing 2048 bit private rsa's for 10s: 347 2048 bit private RSA's in 10.02s
Doing 2048 bit public rsa's for 10s: 11800 2048 bit public RSA's in 9.99s
Doing 4096 bit private rsa's for 10s: 52 4096 bit private RSA's in 10.14s
Doing 4096 bit public rsa's for 10s: 3321 4096 bit public RSA's in 9.99s
OpenSSL 0.9.7b 10 Apr 2003
built on: Thu Sep 25 17:47:01 EDT 2003
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) idea(int) blowfish(idx)
compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DOPENSSL_NO_KRB5 -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -m486 -Wall -DSHA1_ASM -DMD5_ASM -DRMD160_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
                  sign    verify    sign/s verify/s
rsa  512 bits   0.0009s   0.0001s   1110.0  12005.7
rsa 1024 bits   0.0047s   0.0002s    212.4   4010.8
rsa 2048 bits   0.0289s   0.0008s     34.6   1181.2
rsa 4096 bits   0.1950s   0.0030s      5.1    332.4
The following run of the speed program measures RSA operations per host CPU time, not per elapsed time; in other words, it gives the performance of a hypothetical system using an infinitely fast accelerator card.
/opt/src/openssl-0.9.7b/apps # ./openssl speed -engine ibmca rsa
engine "ibmca" set.
Doing 512 bit private rsa's for 10s: 6942 512 bit private RSA's in 0.43s
                                                                   ^^ This is host CPU time, the elapsed time is 10s +/-.
Doing 512 bit public rsa's for 10s: 30522 512 bit public RSA's in 0.50s
Doing 1024 bit private rsa's for 10s: 2139 1024 bit private RSA's in 0.32s
Doing 1024 bit public rsa's for 10s: 19278 1024 bit public RSA's in 0.55s
Doing 2048 bit private rsa's for 10s: 529 2048 bit private RSA's in 0.23s
Doing 2048 bit public rsa's for 10s: 6651 2048 bit public RSA's in 0.14s
RSA sign failure. No RSA sign will be done.
31561:error:8606706E:ibmca engine:IBMCA_MOD_EXP:mexp length to large:hw_ibmca.c:1051:
RSA verify failure. No RSA verify will be done.
31561:error:04077077:rsa routines:RSA_verify:wrong signature length:rsa_sign.c:154:
OpenSSL 0.9.7b 10 Apr 2003
built on: Thu Sep 25 17:47:01 EDT 2003
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) idea(int) blowfish(idx)
compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DOPENSSL_NO_KRB5 -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -m486 -Wall -DSHA1_ASM -DMD5_ASM -DRMD160_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
                  sign    verify    sign/s verify/s
rsa  512 bits   0.0001s   0.0000s  16144.2  61044.0
rsa 1024 bits   0.0001s   0.0000s   6684.4  35050.9
rsa 2048 bits   0.0004s   0.0000s   2300.0  47507.1
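The gap between host CPU time and elapsed time in the run above can be illustrated with a minimal sketch. It is not how the speed program is implemented; it assumes Python's standard time module and uses a hypothetical offloaded_op() that merely sleeps as a stand-in for the host waiting on an accelerator card.

```python
import time


def measure(work):
    """Return (cpu_seconds, wall_seconds) consumed while running work()."""
    wall_start = time.perf_counter()   # wall-clock (elapsed) time
    cpu_start = time.process_time()    # CPU time charged to this process
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu, wall


def offloaded_op():
    # Hypothetical stand-in for an operation done by an accelerator:
    # the host CPU is idle while it waits, so almost no CPU time accrues.
    time.sleep(0.2)


cpu_s, wall_s = measure(offloaded_op)
print(f"host CPU time: {cpu_s:.4f}s, elapsed time: {wall_s:.4f}s")
# Dividing an operation count by cpu_s instead of wall_s inflates ops/s,
# because the time spent waiting for the device is not counted.
```

Dividing by CPU time answers "how much host CPU does one operation cost", while dividing by elapsed time answers "how many operations per second does the system actually deliver".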
Relating the measured numbers to 10 s of elapsed time gives the following results:
                sign/s  verify/s
rsa  512 bits      694      3052
rsa 1024 bits      214      1928
rsa 2048 bits       53       665
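The arithmetic behind these figures can be re-checked with a short sketch. The operation counts and the software-only sign/s values are copied from the two runs in this mail; the 10 s elapsed time per test is the nominal value (an assumption, since the real elapsed time is only approximately 10 s), and the variable names are made up for illustration.

```python
# Nominal wall-clock duration of each test (assumption: actual is 10 s +/-).
ELAPSED_S = 10.0

# (private ops, public ops) counted in the "-engine ibmca" run.
engine_counts = {
    512: (6942, 30522),
    1024: (2139, 19278),
    2048: (529, 6651),
}

# sign/s reported by the software-only run on the two host CPUs.
software_sign_per_s = {512: 1110.0, 1024: 212.4, 2048: 34.6}


def elapsed_rates(counts, elapsed_s=ELAPSED_S):
    """Convert operation counts into (sign/s, verify/s) per elapsed second."""
    return {bits: (sign / elapsed_s, verify / elapsed_s)
            for bits, (sign, verify) in counts.items()}


rates = elapsed_rates(engine_counts)
for bits, (sign, verify) in rates.items():
    speedup = sign / software_sign_per_s[bits]
    print(f"rsa {bits:4d} bits  sign/s {sign:7.1f}  verify/s {verify:7.1f}  "
          f"sign speedup vs. software {speedup:.2f}")
```

For 2048 bits this gives about 53 signs per second against 34.6 in software, which is where the factor of roughly 1.5 comes from.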
Redo the measurement with the -elapsed option; it should reproduce the results given above.
Ciao,
Richard
--
Dr. Richard W. Könning
Fujitsu Siemens Computers GmbH
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                            [EMAIL PROTECTED]
Automated List Manager                               [EMAIL PROTECTED]