As for the RFC part: NO! This is NOT the way to do it, for several reasons (in ascending order of importance):

- OpenSSL assembler modules are maintained as dual-ABI, i.e. suitable
  for both Unix and Win64;
- "and $-16, %rdx" is unacceptable in this context. The relevant
  interface is exposed to the end user, and we have to allow for the
  possibility that the key schedule is memcpy-ed to a location with
  different alignment;
- a zero-copy CBC routine gives a fair performance improvement even in
  the ordinary case, and driving an ultra-fast block function from C
  would be just wasteful. In other words, AESENC/DEC would benefit more
  from a dedicated CBC routine (see also comment below);
- the implementation should allow for pipelining.

As for the latter, I am referring to the possibility of scheduling
multiple AESENC/DEC instructions with the same key schedule element and
multiple data chunks. It's possible in modes that allow for
parallelization (e.g. ECB, CBC decrypt, CTR), and as far as I understand
it is even recommended. So we are more or less obliged to allow for this
option.

The answer is an engine. I mean this should preferably be implemented as
an engine that is able to take full advantage of the architecture, not
as a patch to the general-purpose block function.

> This patch adds support to Intel AES-NI instruction set for x86_64
> platform.
>
> Intel AES-NI is a new set of Single Instruction Multiple Data (SIMD)
> instructions that are going to be introduced in the next generation of
> Intel processor, as of 2009.

Hardware however is not expected before 2010, right? A.
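
Illustrative note on the alignment objection above: a minimal portable C sketch, not taken from the patch. It assumes a hypothetical scheme in which the key buffer carries 16 spare bytes and the code rounds the pointer up to a 16-byte boundary on every call (the C analogue of an "and $-16" adjustment). The names SCHED_BYTES, KEY_BYTES, align16, set_key, schedule_intact and schedule_survives_copy are all made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SCHED_BYTES 32                  /* hypothetical schedule size, kept small */
#define KEY_BYTES   (SCHED_BYTES + 16)  /* hypothetical padding so the schedule can slide */

/* C analogue of "add 15, then and $-16": round up to a 16-byte boundary. */
static unsigned char *align16(unsigned char *p)
{
    return (unsigned char *)(((uintptr_t)p + 15) & ~(uintptr_t)15);
}

/* Write a recognizable schedule at the aligned address inside the key buffer. */
static void set_key(unsigned char *key)
{
    unsigned char *s = align16(key);
    for (int i = 0; i < SCHED_BYTES; i++)
        s[i] = (unsigned char)(i + 1);
}

/* Re-derive the aligned address, as a block function would, and check the data. */
static int schedule_intact(unsigned char *key)
{
    unsigned char *s = align16(key);
    for (int i = 0; i < SCHED_BYTES; i++)
        if (s[i] != (unsigned char)(i + 1))
            return 0;
    return 1;
}

/* Returns 1 if the schedule is still found after memcpy to an address that is
 * `misalign` bytes past a 16-byte boundary, 0 otherwise. */
int schedule_survives_copy(unsigned misalign)
{
    unsigned char raw[2 * KEY_BYTES + 32];
    unsigned char *arena = align16(raw);                /* 16-byte aligned base */
    unsigned char *src = arena;                         /* aligned source key */
    unsigned char *dst = arena + KEY_BYTES + misalign;  /* shifted destination */

    assert(misalign < 16);
    memset(raw, 0, sizeof raw);
    set_key(src);
    memcpy(dst, src, KEY_BYTES);   /* what an end user might legitimately do */
    return schedule_intact(dst);
}
```

With these assumptions, schedule_survives_copy(0) returns 1, while any misalign from 1 to 15 returns 0: the copy preserves the bytes, but the re-derived aligned address inside the copy no longer coincides with where the schedule was written.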