Keep in mind that [unlike Gladman's code] OpenSSL code has to be position
independent! It's surely no problem on x86_64, but on x86 this puts you in a very
tight spot. But I've sketched some 32-bit PIC code already [as previously
mentioned "I might have an opportunity to play with AES some day *this*
year"], so give me a few more days...
oh i guess you need to throw away another register on 32-bit x86 to load up the table base address.
perhaps if you copied the key-schedule/context to the stack you could refer to it off %esp, and then use %ebp as a base register for the tables? it would pay if you can amortize the stack copy over multiple blocks...
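To make the amortization argument concrete, here is a minimal C sketch. Everything in it is hypothetical: `toy_key` stands in for the real key schedule, and `toy_encrypt_block` is a placeholder transform, not AES. C cannot express the %esp/%ebp register assignment itself, so this only illustrates the "copy once, reuse for every block" part of the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCK  16   /* block size in bytes, as in AES */
#define TOY_ROUNDS 10   /* round count, as in AES-128 */

/* Hypothetical stand-in for the key schedule: one round key per round. */
typedef struct {
    uint8_t rk[TOY_ROUNDS + 1][TOY_BLOCK];
} toy_key;

/* Placeholder transform (NOT AES): just enough work to read every round key. */
static void toy_encrypt_block(const uint8_t *in, uint8_t *out, const toy_key *ks)
{
    uint8_t s[TOY_BLOCK];
    memcpy(s, in, TOY_BLOCK);
    for (int r = 0; r <= TOY_ROUNDS; r++)
        for (int i = 0; i < TOY_BLOCK; i++)
            s[i] = (uint8_t)((s[i] ^ ks->rk[r][i]) + r);
    memcpy(out, s, TOY_BLOCK);
}

/* Bulk path: one copy of the schedule onto the stack, reused for all blocks. */
static void toy_encrypt_bulk(const uint8_t *in, uint8_t *out, size_t len,
                             const toy_key *ks)
{
    assert(len % TOY_BLOCK == 0);   /* whole blocks only in this sketch */
    toy_key local = *ks;            /* the copy whose cost is amortized */
    for (size_t off = 0; off < len; off += TOY_BLOCK)
        toy_encrypt_block(in + off, out + off, &local);
}
```

The copy costs the same regardless of `len`, so its share of the total shrinks as the number of blocks grows; for a single block it is pure overhead.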
That would require [major] surgery to the API and would most likely push a bunch of "front-end" functions such as AES_cbc_encrypt into assembler... As it's unlikely to result in further *significant* improvement, I'd rather not :-)
Just for the record: as Dean has shown, one can expect a ~30% *asymptotic* gain from copying the key schedule into a controlled place on the stack. "Asymptotic" means the gain applies to larger chunk sizes only, and it implies API surgery. It also means that small-packet-oriented applications [such as ssh] are likely to suffer (though this can be avoided by maintaining two code paths and choosing one at run time depending on input length :-). "Controlled place on the stack" implies that the "front-end" functions are implemented in assembler... Is there interest in this? A.
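For illustration only, the "two code paths chosen at run time" idea might look roughly like the C sketch below. All names are hypothetical, the block transform is a placeholder rather than AES, and the 256-byte cut-over is an arbitrary assumption; a real threshold would have to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BLK = 16, NRK = 11 };      /* AES-128-like block size and round-key count */
enum { BULK_THRESHOLD = 256 };    /* assumed cut-over point in bytes, not measured */

typedef struct { uint8_t rk[NRK][BLK]; } sched;   /* hypothetical key schedule */

/* Placeholder block transform (NOT AES): XOR each byte with every round key. */
static void toy_block(const uint8_t *in, uint8_t *out, const sched *ks)
{
    for (int i = 0; i < BLK; i++) {
        uint8_t b = in[i];
        for (int r = 0; r < NRK; r++)
            b ^= ks->rk[r][i];
        out[i] = b;
    }
}

/* Small-packet path: use the caller's schedule directly, no copy. */
static void encrypt_small(const uint8_t *in, uint8_t *out, size_t len,
                          const sched *ks)
{
    for (size_t off = 0; off < len; off += BLK)
        toy_block(in + off, out + off, ks);
}

/* Large-chunk path: pay for one stack copy of the schedule, then reuse it. */
static void encrypt_large(const uint8_t *in, uint8_t *out, size_t len,
                          const sched *ks)
{
    sched local = *ks;
    for (size_t off = 0; off < len; off += BLK)
        toy_block(in + off, out + off, &local);
}

/* Front end: pick a path by input length. Returns 1 if the bulk path ran. */
static int toy_encrypt(const uint8_t *in, uint8_t *out, size_t len,
                       const sched *ks)
{
    assert(len % BLK == 0);       /* whole blocks only in this sketch */
    if (len >= BULK_THRESHOLD) {
        encrypt_large(in, out, len, ks);
        return 1;
    }
    encrypt_small(in, out, len, ks);
    return 0;
}
```

Both paths produce identical output; only the cost profile differs, which is what lets the choice be made purely on input length.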
______________________________________________________________________
OpenSSL Project http://www.openssl.org
Development Mailing List [email protected]
Automated List Manager [EMAIL PROTECTED]
