I'm more used to dealing with PKCS#11, where the call overhead is usually measurable, but even so, for plain AES it's probably not a problem. For something like AES-GCM with the AES in the engine and the GCM hash in OpenSSL, though, I'd expect to see an impact: you are basically doing the AES a block at a time in that scenario. That's where I'm claiming you'll be sacrificing performance long term. And for instructions that are wired into the CPU and unprivileged, there's no real gain in using an engine.
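The block-at-a-time concern can be sketched in miniature. This is not OpenSSL code: `engine_encrypt` is a hypothetical stand-in for a per-call engine dispatch, and the XOR "cipher" is purely illustrative. The point is that splitting a combined mode so the cipher is invoked once per 16-byte block pays the dispatch cost on every block, whereas a single bulk call pays it once:

```python
import time

BLOCK = 16
DATA = bytes(256 * 1024)

def engine_encrypt(buf: bytes) -> bytes:
    # Stand-in for an engine call; a real engine adds per-call dispatch cost
    # (function-pointer indirection, context lookup) on top of the crypto.
    return bytes(b ^ 0x5A for b in buf)

def one_call(data: bytes) -> bytes:
    # Whole buffer handed over in a single dispatch.
    return engine_encrypt(data)

def block_at_a_time(data: bytes) -> bytes:
    # One dispatch per 16-byte AES block, as when GHASH stays in OpenSSL
    # and only the AES lives behind the engine boundary.
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        out += engine_encrypt(data[i:i + BLOCK])
    return bytes(out)

t0 = time.perf_counter(); bulk = one_call(DATA); t1 = time.perf_counter()
split = block_at_a_time(DATA); t2 = time.perf_counter()
print(f"bulk: {t1 - t0:.4f}s  per-block: {t2 - t1:.4f}s")
```

Both paths produce identical output; only the number of crossings of the "engine" boundary differs, which is where the long-term performance sacrifice would come from.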
The other issue, FIPS, you already covered. Yes, I care for that reason as well: FIPS certifying with code in an engine will be more difficult, but it really only impacts people who do their own FIPS certifications. Pretty much our problem to deal with. Like I said, though, your call.

Peter

From: Andy Polyakov <ap...@openssl.org>
To: openssl-dev@openssl.org
Date: 08/11/2011 05:00
Subject: Re: [openssl.org #2627] SPARC T4 support for OpenSSL
Sent by: owner-openssl-...@openssl.org

Peter Waltenberg wrote:
> There are some fairly severe performance hits in engine support unless the
> engine includes all the submodes as well.
> That includes things you are just starting to play with now, like the combined
> AES+SHA1 on x86.

??? Here is output for 'speed -engine intel-accel -evp aes-128-cbc-hmac-sha1' for 1.0.0d, i.e. through the engine:

type                     16 bytes    64 bytes    256 bytes   1024 bytes  8192 bytes
aes-128-cbc-hmac-sha1   202516.18k  322609.98k  432125.60k  480232.03k  496191.36k

And here is output for 'speed -evp aes-128-cbc-hmac-sha1' for HEAD, i.e. without the engine:

aes-128-cbc-hmac-sha1   237351.62k  326968.34k  432138.62k  482383.80k  497401.86k

"Engine" overhead is significant at 16-byte chunks *only* and hardly noticeable otherwise. What severe performance hits are we talking about? EVP has overhead, but I can't see that it's engine-specific. Combined cipher+hash implementations do minimize EVP overhead (you don't have to make two EVP calls), but that was not the reason for implementing the above-mentioned "stitched" modes; higher instruction-level parallelism was.

> For features that are part of CPU's - rather than plug in cards - my preference
> would be that the implementation is inline so that every last drop of
> performance can eventually be wrung out of it.

As mentioned, there are other factors in play, such as maintenance, adoption time...
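The shape of those `speed` numbers (overhead visible at 16-byte chunks, amortized away by 8 KiB) can be reproduced with a rough stdlib analogue. This is an assumption-laden sketch: Python's HMAC-SHA1 stands in for the EVP aes-128-cbc-hmac-sha1 operation, and the per-`hmac.new` setup cost plays the role of the fixed per-call EVP/engine overhead:

```python
import hashlib
import hmac
import time

KEY = b"\x00" * 16
TOTAL = 2 * 1024 * 1024  # bytes processed per measurement

# Sweep the same chunk sizes as `openssl speed`: fixed per-call cost
# dominates at 16 bytes and washes out at large chunks.
for chunk in (16, 64, 256, 1024, 8192):
    buf = bytes(chunk)
    t0 = time.perf_counter()
    for _ in range(TOTAL // chunk):
        hmac.new(KEY, buf, hashlib.sha1).digest()  # one full op per chunk
    dt = time.perf_counter() - t0
    print(f"{chunk:5d} bytes: {TOTAL / dt / 1024:10.1f} kB/s")
```

As in the quoted benchmark, throughput climbs steeply from 16-byte to 8192-byte chunks because the same fixed setup cost is spread over more payload; the engine-vs-inline delta in the email is the same effect, visible only where that fixed cost is a large fraction of the work.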
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org