SSL negotiation (where the device is the server) takes about 2s
as it currently stands, and that's with the current MIPS assembler
support in OpenSSL.

I grabbed openssl-SNAP-20120917
...
./openssl-generic32 speed aes-128-cbc sha rsa1024
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1               192.15k      639.49k     1709.12k     2902.92k     3385.66k
aes-128 cbc       1114.94k     1180.75k     1197.42k     1192.74k     1096.80k
sha256             145.81k      468.03k      942.52k     1262.14k     1358.08k
sha512            3749.33        14.97k       20.57k       27.62k       30.72k
                  sign    verify    sign/s verify/s
rsa 1024 bits 0.363571s 0.011161s      2.8     89.6


./openssl-mips32r2  speed aes-128-cbc sha rsa1024
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1               297.78k      964.57k     2479.43k     4027.96k     4487.91k
aes-128 cbc       1284.54k     1369.60k     1392.86k     1396.18k     1272.49k
sha256             239.27k      666.35k     1385.98k     1897.49k     2035.75k
sha512            3752.82        14.99k       20.50k       27.65k       30.72k
                  sign    verify    sign/s verify/s
rsa 1024 bits 0.135811s 0.009086s      7.4    110.1


Thanks!

Thank *you*! I was expecting somewhat better performance (in absolute terms), but
it might be limited by the interface to external memory. The sha512
performance at least is exceptionally bad, and that surely comes down to poor
external memory performance. I mean it has to keep more data in memory than any
other algorithm in question, and that's what is likely to hurt it that much. It
should be only a few times slower than sha256 (e.g. a factor of 2.2 was observed
on R5000), not *60* as you measured. The >2x improvement in rsa1024 is also
unexpected, but in the other direction: it's more than expected, so one can't
complain here...
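
For reference, the ratios discussed here can be recomputed from the two
benchmark runs. This is only a sketch: the figures are copied from the 8192-byte
column and the rsa1024 sign column of the outputs, and the variable and function
names are made up for illustration.

```python
# Throughput at 8192-byte blocks, in thousands of bytes per second,
# copied from the two `openssl speed` runs.
generic = {"sha1": 3385.66, "sha256": 1358.08, "sha512": 30.72}
mips32r2 = {"sha1": 4487.91, "sha256": 2035.75, "sha512": 30.72}

# RSA-1024 private-key (sign) time in seconds from the same runs.
rsa1024_sign = {"generic": 0.363571, "mips32r2": 0.135811}


def slowdown(table, slow="sha512", fast="sha256"):
    """Return how many times slower `slow` is than `fast` within one run."""
    return table[fast] / table[slow]


sha512_gap_generic = slowdown(generic)
sha512_gap_asm = slowdown(mips32r2)
rsa_speedup = rsa1024_sign["generic"] / rsa1024_sign["mips32r2"]

print(f"sha512 vs sha256, generic build:  {sha512_gap_generic:.1f}x slower")
print(f"sha512 vs sha256, mips32r2 build: {sha512_gap_asm:.1f}x slower")
print(f"rsa1024 sign speedup from asm:    {rsa_speedup:.2f}x")
```

At 8192-byte blocks the gap comes out at roughly 44x for the generic build and
66x for the assembler build, since sha512 did not speed up at all while sha256
did.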

Now to the original question. You said that SSL negotiation takes 2s (on the
server side, so assuming RSA, the RSA sign is dominating), and it's an open
question whether that's fast enough for your purposes. If you are using a
1024-bit key, then it should go faster: the sign itself takes ~0.14s above, so
2s probably means that you're spending a notable portion of the time elsewhere,
most likely synthesizing randoms. The datasheet you referred to mentions that
the microcontroller in question has a TRNG, and it definitely should improve the
situation if you find a way to utilize it. If it's a longer key we're talking
about, then... Well, as mentioned, the CPU in question implements the SmartMIPS
extension, which means that there is room for further improvement. It's hard to
estimate...
https://www.mips.com/products/processor-cores/classic/mips32-4k/ mentions 15ms
for rsa1024 sign at 200MHz, which is >4x better than the above result (scaled
for 96MHz)... Well, it might be possible if one implements a dedicated procedure
targeting specifically 1024-bit key operations (modulo the security
counter-measures implemented in OpenSSL), otherwise 2x is probably more
feasible for the general case...
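
The ">4x" figure can be reproduced with a simple clock scaling. This is a rough
sketch that assumes execution time scales linearly with clock frequency (same
cycle count, no memory-bound effects), which may well not hold on this device;
the helper name is hypothetical.

```python
def scale_time_to_clock(time_s, from_mhz, to_mhz):
    """Scale an execution time to another clock, assuming an unchanged
    cycle count (a simplification that ignores memory stalls)."""
    return time_s * from_mhz / to_mhz


vendor_sign_s = 0.015        # 15ms rsa1024 sign quoted at 200MHz
vendor_clock_mhz = 200.0
device_clock_mhz = 96.0      # device clock used for the scaling
measured_sign_s = 0.135811   # mips32r2 rsa1024 sign from the benchmark

scaled_s = scale_time_to_clock(vendor_sign_s, vendor_clock_mhz, device_clock_mhz)
headroom = measured_sign_s / scaled_s

print(f"vendor figure scaled to 96MHz: {scaled_s * 1000:.2f}ms")
print(f"measured / scaled:             {headroom:.2f}x")
```

Scaled to 96MHz the quoted 15ms becomes about 31ms, against the ~136ms measured
with the mips32r2 build.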

The datasheet also mentions that the CPU in question has support for hardware
AES-128. It would surely make a difference if one manages to utilize it. Just
keep in mind that in such a case the hash function would be the limiting factor.
I mean if encryption gets a lot faster, you still have to hash the data, so you
won't be able to break the ~4.5MB/s measured for SHA1 above.
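
To make the hash-bound argument concrete, here is a back-of-the-envelope model.
It assumes every byte is hashed and then encrypted one after the other on the
same CPU, so per-byte times add up; real record processing has extra overhead
(padding, MAC finalization, copying) that is ignored here. The function name is
hypothetical and the rates are the 8192-byte mips32r2 figures.

```python
import math


def serial_throughput(*stage_rates):
    """Combined rate when each byte passes through every stage in turn:
    per-byte times add, so the rates combine harmonically."""
    return 1.0 / sum(1.0 / rate for rate in stage_rates)


sha1_k = 4487.91     # thousands of bytes per second, mips32r2 build
aes128_k = 1272.49   # software aes-128-cbc, same run

software_path = serial_throughput(aes128_k, sha1_k)
# Idealized hardware AES: encryption costs no CPU time at all.
ideal_hw_aes_path = serial_throughput(math.inf, sha1_k)

print(f"software AES + SHA1:  {software_path:.0f}k")
print(f"'free' AES + SHA1:    {ideal_hw_aes_path:.0f}k")
```

Under this model the software path lands at about 0.99MB/s, and even with
encryption costing nothing the result cannot exceed the SHA1 rate.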

I'm using rsa2048 currently for the server certificate and it is taking a long
time...probably due to that.  Here are aes-256 and rsa2048 numbers from that
device in case you're interested.

# ./openssl-mips32r2  speed aes-256-cbc rsa2048
Doing aes-256 cbc for 3s on 16 size blocks: 179907 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 46629 aes-256 cbc's in 2.97s
Doing aes-256 cbc for 3s on 256 size blocks: 11975 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 2994 aes-256 cbc's in 3.01s
Doing aes-256 cbc for 3s on 8192 size blocks: 345 aes-256 cbc's in 3.00s
Doing 2048 bit private rsa's for 10s: 9 2048 bit private RSA's in 10.10s
Doing 2048 bit public rsa's for 10s: 389 2048 bit public RSA's in 9.95s
OpenSSL 1.1.0-dev xx XXX xxxx
built on: Mon Sep 17 09:34:52 EDT 2012
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) 
blowfish(ptr)
compiler: mipsel-linux-gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN 
-DHAVE_DLFCN_H -D_FILE_OFFSET_BITS=32 --sysroot=/opt/uclibc -mips32r2 -mabi=32 
-DTERMIO -O3 -Wall -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DAES_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc        959.50k     1004.80k     1021.87k     1018.56k      942.08k
                  sign    verify    sign/s verify/s
rsa 2048 bits 1.122222s 0.025578s      0.9     39.1
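
For anyone reading the output above: the summary rows follow directly from the
"Doing ..." lines. A small sketch of that arithmetic, with the counts and timings
copied from the run and helper names made up for illustration:

```python
# (block size in bytes, operations completed, elapsed seconds), copied from
# the "Doing aes-256 cbc ..." lines.
aes256_runs = [
    (16, 179907, 3.00),
    (64, 46629, 2.97),
    (256, 11975, 3.00),
    (1024, 2994, 3.01),
    (8192, 345, 3.00),
]


def kbytes_per_second(block_size, count, seconds):
    """Throughput in 1000s of bytes per second, as `openssl speed` reports it."""
    return block_size * count / seconds / 1000.0


def seconds_per_op(count, seconds):
    """Average wall time of one operation."""
    return seconds / count


aes256_k = [round(kbytes_per_second(*run), 2) for run in aes256_runs]
rsa2048_sign_s = seconds_per_op(9, 10.10)      # 9 private ops in 10.10s
rsa2048_verify_s = seconds_per_op(389, 9.95)   # 389 public ops in 9.95s

print(aes256_k)
print(f"rsa2048 sign:   {rsa2048_sign_s:.6f}s")
print(f"rsa2048 verify: {rsa2048_verify_s:.6f}s")
```

A single rsa2048 sign at ~1.12s accounts for more than half of the 2s
negotiation time on its own.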


Anyhow, I'll probably end up having to switch to rsa1024 to get
reasonable enough performance.  The performance of the AES operations
is within the realm of reasonability for this platform, it's just
RSA that's killing me :)

Thanks.
-Brad
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org
