SSL negotiation (where the device is the server) takes about 2s
as it currently stands, and that's with the current MIPS assembler
support in OpenSSL.
I grabbed openssl-SNAP-20120917
...
./openssl-generic32 speed aes-128-cbc sha rsa1024
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1               192.15k      639.49k     1709.12k     2902.92k     3385.66k
aes-128 cbc       1114.94k     1180.75k     1197.42k     1192.74k     1096.80k
sha256             145.81k      468.03k      942.52k     1262.14k     1358.08k
sha512            3749.33        14.97k       20.57k       27.62k       30.72k
                  sign    verify    sign/s verify/s
rsa 1024 bits 0.363571s 0.011161s      2.8     89.6
./openssl-mips32r2 speed aes-128-cbc sha rsa1024
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1               297.78k      964.57k     2479.43k     4027.96k     4487.91k
aes-128 cbc       1284.54k     1369.60k     1392.86k     1396.18k     1272.49k
sha256             239.27k      666.35k     1385.98k     1897.49k     2035.75k
sha512            3752.82        14.99k       20.50k       27.65k       30.72k
                  sign    verify    sign/s verify/s
rsa 1024 bits 0.135811s 0.009086s      7.4    110.1
Thanks!
Thank *you*! I was expecting somewhat better performance (in absolute terms),
but it might be limited by the interface to external memory. The sha512
performance in particular is exceptionally bad, and that surely comes down to
poor external memory performance: it has to keep more data in memory than any
other algorithm in question, and that is what is likely hurting it so much.
It should be only a few times slower than sha256 (e.g. a factor of 2.2 was
observed on R5000), not *60* as you measured. The >2x improvement in rsa1024
is also unexpected, but in the other direction: it is more than expected, so
one can't complain here...
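To make the two ratios above concrete, here is a small sketch that recomputes
them from the quoted `openssl speed` output. The dictionary layout and the
function names are my own, chosen for illustration; only the numbers come from
the runs above, and I picked the 8192-byte column because it is the least
affected by per-call overhead.

```python
# Throughput at 8192-byte blocks, in 1000s of bytes per second, copied from
# the two `openssl speed` runs quoted above.
THROUGHPUT_8192 = {
    "generic32": {"sha256": 1358.08, "sha512": 30.72},
    "mips32r2": {"sha256": 2035.75, "sha512": 30.72},
}

# Seconds per RSA-1024 private-key operation from the same two runs.
RSA1024_SIGN_SECONDS = {"generic32": 0.363571, "mips32r2": 0.135811}


def sha512_slowdown(build: str) -> float:
    """Return how many times slower sha512 is than sha256 for one build."""
    row = THROUGHPUT_8192[build]
    return row["sha256"] / row["sha512"]


def rsa_sign_speedup() -> float:
    """Return the RSA-1024 sign speed-up of the assembler build over generic."""
    return RSA1024_SIGN_SECONDS["generic32"] / RSA1024_SIGN_SECONDS["mips32r2"]


if __name__ == "__main__":
    for build in THROUGHPUT_8192:
        print(f"{build}: sha512 is {sha512_slowdown(build):.1f}x slower than sha256")
    print(f"rsa1024 sign speed-up: {rsa_sign_speedup():.2f}x")
```

With these inputs the sha512 gap comes out at roughly 44x for the generic
build and roughly 66x for the mips32r2 build, and the RSA sign speed-up at
about 2.7x.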
Now to the original question. You said that SSL negotiation takes 2s (on the
server side, so assuming RSA, the RSA sign is dominating), and it's an open
question whether that is fast enough for your purposes. If you are using a
1024-bit key, then it should go faster: the sign itself takes about 0.14s in
your measurement above, so 2s probably means that you're spending a notable
portion of the time elsewhere, most likely synthesizing randoms. The datasheet
you referred to mentions that the microcontroller in question has a TRNG, and
it should definitely improve the situation if you find a way to utilize it.
If it's a longer key we're talking about, then... Well, as mentioned, the CPU
in question implements the SmartMIPS extension, which means that there is room
for further improvement. It's hard to estimate...
https://www.mips.com/products/processor-cores/classic/mips32-4k/ mentions 15ms
for rsa1024 sign at 200MHz, which is >4x better than the above result (scaled
for 96MHz)... Well, it might be possible if one implements a dedicated
procedure targeting specifically 1024-bit key operations (modulo the security
counter-measures implemented in OpenSSL); otherwise 2x is probably more
feasible for the general case...
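The ">4x" figure follows from simple clock scaling. A minimal sketch of that
arithmetic, assuming execution time is inversely proportional to clock
frequency (which ignores memory latency, so treat it as a rough bound rather
than a prediction); the function and constant names are hypothetical:

```python
def scale_time_to_clock(time_s: float, from_mhz: float, to_mhz: float) -> float:
    """Scale an execution time to another clock rate, assuming time ~ 1/f."""
    return time_s * from_mhz / to_mhz


VENDOR_SIGN_S = 0.015        # 15ms rsa1024 sign at 200MHz, per the page cited above
MEASURED_SIGN_S = 0.135811   # mips32r2 build, rsa1024 sign, from the run above

# What the vendor figure would correspond to at 96MHz.
vendor_at_96mhz = scale_time_to_clock(VENDOR_SIGN_S, from_mhz=200.0, to_mhz=96.0)

# How far the measured result is from that scaled figure.
gap = MEASURED_SIGN_S / vendor_at_96mhz

if __name__ == "__main__":
    print(f"vendor figure scaled to 96MHz: {vendor_at_96mhz * 1000:.2f}ms")
    print(f"measured sign is {gap:.2f}x slower")
```

Scaled to 96MHz the vendor figure is about 31ms, so the measured 136ms is a
little over 4.3x slower.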
The datasheet also mentions that the CPU in question has support for hardware
AES-128. It would surely make a difference if one manages to utilize it. Just
keep in mind that in that case the hash function would be the limiting factor.
I mean, even if encryption gets a lot faster, you still have to hash the data,
so you won't be able to break ~4.5MBps with SHA1.
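The hash-bound argument can be sketched numerically. This is a simplified
model, assuming every byte is first encrypted and then hashed one after the
other on the same CPU, and ignoring record framing and HMAC overhead; the
function name is hypothetical and the rates are the 8192-byte mips32r2
figures from the run above.

```python
def combined_throughput(cipher_rate: float, hash_rate: float) -> float:
    """Rate when each byte is both encrypted and hashed sequentially.

    Time per byte is the sum of the two per-byte times, so the combined
    rate is the reciprocal of that sum. Units follow the inputs.
    """
    return 1.0 / (1.0 / cipher_rate + 1.0 / hash_rate)


AES128_CBC_K = 1272.49   # 1000s of bytes/s, software AES-128-CBC
SHA1_K = 4487.91         # 1000s of bytes/s, SHA1

# Software AES plus SHA1 today.
software_only = combined_throughput(AES128_CBC_K, SHA1_K)

# Idealised hardware AES: encryption costs no CPU time at all.
hardware_aes_limit = combined_throughput(float("inf"), SHA1_K)

if __name__ == "__main__":
    print(f"software AES + SHA1: {software_only:.0f}k")
    print(f"upper bound with free AES: {hardware_aes_limit:.0f}k")
```

Under this model the software-only path lands near 990k, and even with
encryption made free the ceiling is the SHA1 rate of about 4.5MBps.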
I'm currently using rsa2048 for the server certificate, and that is probably
why it is taking a long time. Here are the aes-256 and rsa2048 numbers from
that device in case you're interested.
# ./openssl-mips32r2 speed aes-256-cbc rsa2048
Doing aes-256 cbc for 3s on 16 size blocks: 179907 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 46629 aes-256 cbc's in 2.97s
Doing aes-256 cbc for 3s on 256 size blocks: 11975 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 2994 aes-256 cbc's in 3.01s
Doing aes-256 cbc for 3s on 8192 size blocks: 345 aes-256 cbc's in 3.00s
Doing 2048 bit private rsa's for 10s: 9 2048 bit private RSA's in 10.10s
Doing 2048 bit public rsa's for 10s: 389 2048 bit public RSA's in 9.95s
OpenSSL 1.1.0-dev xx XXX xxxx
built on: Mon Sep 17 09:34:52 EDT 2012
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int)
blowfish(ptr)
compiler: mipsel-linux-gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN
-DHAVE_DLFCN_H -D_FILE_OFFSET_BITS=32 --sysroot=/opt/uclibc -mips32r2 -mabi=32
-DTERMIO -O3 -Wall -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DAES_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc        959.50k     1004.80k     1021.87k     1018.56k      942.08k
                  sign    verify    sign/s verify/s
rsa 2048 bits 1.122222s 0.025578s      0.9     39.1
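For reference, the summary rows can be recomputed from the raw "Doing ..."
counts printed earlier in the same run. A minimal sketch follows; the helper
names are my own, and only the counts and timings come from the output above.

```python
def throughput_k(ops: int, block_bytes: int, elapsed_s: float) -> float:
    """Return throughput in 1000s of bytes per second."""
    return ops * block_bytes / elapsed_s / 1000.0


def seconds_per_op(ops: int, elapsed_s: float) -> float:
    """Return the average time of one operation in seconds."""
    return elapsed_s / ops


# (block size in bytes, operations completed, elapsed seconds) for aes-256 cbc.
AES256_RUNS = [
    (16, 179907, 3.00),
    (64, 46629, 2.97),
    (256, 11975, 3.00),
    (1024, 2994, 3.01),
    (8192, 345, 3.00),
]

# (operations completed, elapsed seconds) for 2048-bit RSA.
RSA2048_PRIVATE = (9, 10.10)
RSA2048_PUBLIC = (389, 9.95)

if __name__ == "__main__":
    for block, ops, elapsed in AES256_RUNS:
        print(f"{block:5d} bytes: {throughput_k(ops, block, elapsed):.2f}k")
    print(f"sign:   {seconds_per_op(*RSA2048_PRIVATE):.6f}s")
    print(f"verify: {seconds_per_op(*RSA2048_PUBLIC):.6f}s")
```

The results agree with the summary table, e.g. 345 blocks of 8192 bytes in
3.00s gives 942.08k, and 9 private operations in 10.10s gives about 1.12s per
sign.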
Anyhow, I'll probably end up having to switch to rsa1024 to get
reasonable enough performance. The performance of the AES operations
is within the realm of reasonability for this platform; it's just
RSA that's killing me :)
Thanks.
-Brad
______________________________________________________________________
OpenSSL Project http://www.openssl.org
Development Mailing List openssl-dev@openssl.org
Automated List Manager majord...@openssl.org