As for the RFC part: NO! This is NOT the way to do it, for several
reasons (in ascending order of importance):

- OpenSSL assembler modules are maintained as dual-ABI, i.e. suitable
for both Unix and Win64;
- "and $-16, %rdx" is unacceptable in this context. The relevant
interface is exposed to the end user, and we have to allow for the
possibility that the key schedule is memcpy-ed to a location with
different alignment;
- a zero-copy CBC routine gives a fair performance improvement even in
the ordinary case, and driving an ultra-fast block function from C would
be just wasteful. In other words, AESENC/DEC would benefit more from a
dedicated CBC routine (see also the comment below);
- the implementation should allow for pipelining.
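
To make the alignment point concrete, here is a minimal sketch in
portable C. The names demo_key, shifted_holder, mask16,
mask_is_harmless and copy_breaks_masking are made up for illustration
and are not OpenSSL's, and the struct layout is an assumption. It only
shows that rounding a pointer down to a 16-byte boundary, which is what
the "and" does, no longer points at the key schedule once the schedule
has been copied to an address that is 8 modulo 16.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key schedule layout, for illustration only. */
typedef struct {
    uint8_t rd_key[16 * 15];   /* room for 15 round keys of 16 bytes */
    int rounds;
} demo_key;

/* Holder that places the embedded schedule at an address of 8 mod 16
 * (assumes alignof(int) <= 8, which holds on common ABIs). */
typedef struct {
    _Alignas(16) uint8_t pad[8];
    demo_key key;
} shifted_holder;

/* C equivalent of "and $-16, %reg": round an address down to 16 bytes. */
static uintptr_t mask16(const void *p)
{
    return (uintptr_t)p & ~(uintptr_t)15;
}

/* Returns 1 if masking leaves the round-key pointer where it was. */
int mask_is_harmless(const demo_key *k)
{
    assert(k != NULL);
    return mask16(k->rd_key) == (uintptr_t)k->rd_key;
}

/* Copy an aligned schedule into the shifted holder and report whether
 * masking is harmless for the original but not for the copy. */
int copy_breaks_masking(void)
{
    struct { _Alignas(16) demo_key key; } a;
    shifted_holder b;

    memset(&a, 0, sizeof a);
    a.key.rounds = 10;
    memcpy(&b.key, &a.key, sizeof b.key);   /* what an end user may do */

    return mask_is_harmless(&a.key) && !mask_is_harmless(&b.key);
}
```

With the copy in place, the masked address lands 8 bytes before the
first round key, so code that masks the pointer would read the wrong
data. Unaligned loads of the key schedule avoid the problem.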

As for the last point, pipelining: I am referring to the possibility of
scheduling multiple AESENC/DEC instructions with the same key schedule
element across multiple data chunks. This is possible in modes that
allow parallelization (e.g. ECB, CBC decrypt, CTR), and as far as I
understand it is even recommended. So we are more or less obliged to
allow for this option.
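
The scheduling pattern I have in mind can be sketched in portable C as
follows. Note the assumptions: toy_round is a placeholder mixing
function, NOT AES and not AESENC, the 64-bit "blocks" and the
interleave factor of 4 are arbitrary, and all names (demo_serial,
demo_interleaved, demo_schedules_agree) are hypothetical. The point is
only the loop structure: each key schedule element is loaded once and
applied to several independent blocks before moving to the next round.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ROUNDS 10

/* Placeholder for one round instruction. This is NOT AES. */
static uint64_t toy_round(uint64_t state, uint64_t round_key)
{
    state ^= round_key;
    return ((state << 13) | (state >> 51)) * 0x9E3779B97F4A7C15ULL;
}

/* One block at a time: every round waits for the previous one. */
void demo_serial(uint64_t *blk, size_t n, const uint64_t *rk)
{
    for (size_t i = 0; i < n; i++)
        for (int r = 0; r < DEMO_ROUNDS; r++)
            blk[i] = toy_round(blk[i], rk[r]);
}

/* Four independent blocks per pass, sharing each round key. */
void demo_interleaved(uint64_t *blk, size_t n, const uint64_t *rk)
{
    size_t i = 0;

    assert(blk != NULL || n == 0);
    for (; i + 4 <= n; i += 4) {
        uint64_t b0 = blk[i], b1 = blk[i + 1];
        uint64_t b2 = blk[i + 2], b3 = blk[i + 3];

        for (int r = 0; r < DEMO_ROUNDS; r++) {
            const uint64_t k = rk[r];   /* key element loaded once */
            b0 = toy_round(b0, k);      /* four independent chains */
            b1 = toy_round(b1, k);
            b2 = toy_round(b2, k);
            b3 = toy_round(b3, k);
        }
        blk[i] = b0; blk[i + 1] = b1;
        blk[i + 2] = b2; blk[i + 3] = b3;
    }
    for (; i < n; i++)                  /* leftover blocks, one by one */
        for (int r = 0; r < DEMO_ROUNDS; r++)
            blk[i] = toy_round(blk[i], rk[r]);
}

/* Returns 1 if both schedules produce identical output on 7 blocks. */
int demo_schedules_agree(void)
{
    uint64_t rk[DEMO_ROUNDS], a[7], b[7];

    for (int r = 0; r < DEMO_ROUNDS; r++)
        rk[r] = 0x0123456789ABCDEFULL * (uint64_t)(r + 1);
    for (size_t i = 0; i < 7; i++)
        a[i] = b[i] = (uint64_t)i * 0x1111111111111111ULL;

    demo_serial(a, 7, rk);
    demo_interleaved(b, 7, rk);
    return memcmp(a, b, sizeof a) == 0;
}
```

Both functions compute the same result; the interleaved one merely
exposes independent work to the pipeline. This only applies where the
blocks do not depend on each other, which is why CBC encrypt is
excluded.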

The answer is an engine. I mean this should preferably be implemented as
an engine that can take full advantage of the architecture, not as a
patch to the general-purpose block function.

> This patch adds support to Intel AES-NI instruction set for x86_64
> platform.
> 
> Intel AES-NI is a new set of Single Instruction Multiple Data (SIMD)
> instructions that are going to be introduced in the next generation of
> Intel processor, as of 2009.

The hardware, however, is not expected before 2010, right? A.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           [EMAIL PROTECTED]
