I measured on Nocona 3.6GHz. With no-asm, the results are:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc     117668.96k   127171.52k   134233.69k   135039.66k   135012.87k

with asm, the results are:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc      74138.23k   127064.68k   158567.68k   169018.37k   169525.25k

So, on 8192 bytes there is a 25% performance boost,

This sounds like the original version. An updated version was uploaded yesterday, which was benchmarked at 160860k on a 3.0GHz Xeon, so try the very latest snapshot too. Do you get 190M+ at 3.6GHz?
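
For reference, the 190M figure is what you get by scaling the 3.0GHz result linearly with clock frequency. A quick sketch of that arithmetic, assuming throughput is purely clock-bound (an assumption; cache and memory effects can break it) and using an illustrative helper name that is not part of any OpenSSL tool:

```python
def scale_throughput(measured_kbytes_per_s, measured_ghz, target_ghz):
    """Project throughput to another clock speed, assuming linear scaling."""
    return measured_kbytes_per_s * target_ghz / measured_ghz


# 160860k was reported at 3.0GHz; project it to the 3.6GHz Nocona.
projected = scale_throughput(160860.0, 3.0, 3.6)
print(f"projected at 3.6GHz: {projected:.0f}k")
```

Anything well below that projection on large blocks would suggest the older code is still being measured.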

however, why does the performance degrade so much on 16 bytes?

This is perfectly expected, because the CBC assembler implementation attempts to mitigate the impact of cache-timing attacks by copying the key schedule to a controlled place on the stack and prefetching the S-box tables. In the "16 bytes" case it does this for every 16 bytes, in the "64 bytes" case for every 64 bytes, and so on. Naturally, this affects small-block performance.

A.
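
To put a rough number on that per-call cost, the figures quoted above can be fitted to a simple model, t(n) = setup + n * per_byte. This is only a sketch: it assumes the "k" column means thousands of bytes per second, it uses just the smallest and largest block sizes for the fit, and the resulting "setup" term also absorbs whatever the benchmark loop itself costs per call. The function names are illustrative.

```python
# Throughput figures quoted in this thread, in 1000s of bytes per second.
SIZES = [16, 64, 256, 1024, 8192]
NO_ASM = [117668.96, 127171.52, 134233.69, 135039.66, 135012.87]
ASM = [74138.23, 127064.68, 158567.68, 169018.37, 169525.25]


def ns_per_call(size, kbytes_per_s):
    """Time for one call that processes `size` bytes, in nanoseconds."""
    return size / (kbytes_per_s * 1000.0) * 1e9


def fit_overhead(sizes, rates):
    """Two-point fit of t(n) = setup + n * per_byte (smallest and largest size)."""
    t_small = ns_per_call(sizes[0], rates[0])
    t_large = ns_per_call(sizes[-1], rates[-1])
    per_byte = (t_large - t_small) / (sizes[-1] - sizes[0])
    setup = t_small - sizes[0] * per_byte
    return setup, per_byte


for name, rates in (("no-asm", NO_ASM), ("asm", ASM)):
    setup, per_byte = fit_overhead(SIZES, rates)
    print(f"{name:6s}: setup ~{setup:.0f} ns/call, {per_byte:.2f} ns/byte")

# Relative speed of asm versus no-asm at each block size.
for size, asm_rate, c_rate in zip(SIZES, ASM, NO_ASM):
    print(f"{size:5d} bytes: asm/no-asm = {asm_rate / c_rate:.2f}")
```

With these numbers the fit puts the fixed cost at roughly 120 ns per call for the asm build against under 20 ns for no-asm, while the per-byte cost is lower for asm. That is consistent with a fixed setup step that dominates at 16 bytes and is amortized away by 1024 bytes.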
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [EMAIL PROTECTED]
