> > and produces correct result on all platforms at the nominal cost (I
> > estimate at most 5% across all platforms) of collecting 32-bit values
> > with 4 byte-loads and accompanying shift and or operations
> > (or couple of rotates and or if compiled with Microsoft C).
>
> "Well, I estimate the new implementation will do better than 5%." :)
Well, my original claim was about the cost of assembling 32-bit
values from 4 single-byte loads, which is no more than 5% relative to
any particular implementation; it was not a comparison between these
two implementations. Either way, I really fail to see how the
proposed implementation would be much faster than the one already
present in the tree. They're practically identical. The only
differences are the byte order of the S-boxes (and therefore the order
of the shifts) and the way the final round is handled. Not to mention
the already mentioned fact that the original implementation operates
on automatic variables.
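
To make the byte-load point concrete, here is a minimal sketch of what
I mean by collecting a 32-bit value with 4 byte loads, shifts and ORs.
The names load_be32 and store_be32 are made up for illustration and
come from neither implementation, and big-endian byte order is assumed
only for the sake of the example. Since only single bytes are ever
dereferenced, the pointer may have any alignment.

```c
#include <stdint.h>

/* Hypothetical helper (not from either implementation): assemble a
 * 32-bit word from four single-byte loads, most significant byte
 * first. Only bytes are dereferenced, so p needs no alignment. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |
            (uint32_t)p[3];
}

/* Hypothetical counterpart: write a 32-bit word back as four bytes,
 * most significant byte first. */
static void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >>  8);
    p[3] = (unsigned char)v;
}
```

The result is the same on little- and big-endian hosts, which is what
buys the "correct result on all platforms" part.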
> Perhaps, I'll produce some actual numbers using OpenSSL and
> both implementations to prove my case.
Please keep in mind that OpenSSL is a cross-platform toolkit, which
means we might have to face and resolve a trade-off.
> > The proposed code is IA-32 specific as IA-32 is the only
> > platform immune
> > to misaligned memory references.
>
> I don't believe this is true, but I'd be happy to see a
> specific example.
RIJNDAEL_ecb_encrypt(const unsigned char *src,
                     unsigned char *dst,
                     long size,
                     const RIJNDAEL_KEY *key,
                     int encrypt)
{
    if (encrypt)
    {
        while (size >= RIJNDAEL_BLOCK)
        {
            RIJNDAEL_encrypt((const RIJNDAEL_WORD*) src,
                             (RIJNDAEL_WORD*) dst,
                             key);
            ...
RIJNDAEL_cbc_encrypt(const unsigned char *src,
                     unsigned char *dst,
                     ...
    if (encrypt)
    {
        while (size >= RIJNDAEL_BLOCK)
        {
            XOR_BLOCK(dst, src, iv);
            RIJNDAEL_encrypt((const RIJNDAEL_WORD*) dst,
                             (RIJNDAEL_WORD*) dst,
                             key);
If either src or dst is misaligned, the code bombs with a bus error on
all platforms but IA-32. Well, it doesn't bomb on Alpha, which handles
misaligned accesses in a trap handler, but as it's a trap, performance
drops below any reasonable value, which makes you badly wish the data
were aligned.
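
For illustration only, one way to keep a word-oriented core and still
accept arbitrary byte pointers is to bounce each block through an
aligned temporary. The sketch below uses invented names
(demo_encrypt_words, demo_encrypt_bytes, DEMO_BLOCK_BYTES), and the
"cipher" is just a placeholder XOR standing in for the real routine;
it is not taken from either implementation. Note that memcpy yields
words in host byte order, so this addresses alignment only, not
endianness.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_BLOCK_BYTES 16  /* assumed block size, for illustration */

/* Placeholder for a word-oriented block routine. It only XORs the
 * input with the key words; it is NOT a cipher and merely stands in
 * for a routine that requires aligned word pointers. */
static void demo_encrypt_words(const uint32_t *in, uint32_t *out,
                               const uint32_t *key)
{
    int i;
    for (i = 0; i < DEMO_BLOCK_BYTES / 4; i++)
        out[i] = in[i] ^ key[i];
}

/* Hypothetical byte-pointer wrapper: src and dst may have any
 * alignment, because the data is copied through an aligned local
 * buffer instead of casting the byte pointers to word pointers. */
static void demo_encrypt_bytes(const unsigned char *src,
                               unsigned char *dst,
                               const uint32_t *key)
{
    uint32_t tmp[DEMO_BLOCK_BYTES / 4];

    memcpy(tmp, src, DEMO_BLOCK_BYTES);  /* safe for misaligned src */
    demo_encrypt_words(tmp, tmp, key);   /* aligned word access only */
    memcpy(dst, tmp, DEMO_BLOCK_BYTES);  /* safe for misaligned dst */
}
```

The extra copies are of course not free, which is exactly the kind of
trade-off I was referring to above.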
Andy.
______________________________________________________________________
OpenSSL Project http://www.openssl.org
Development Mailing List [EMAIL PROTECTED]
Automated List Manager [EMAIL PROTECTED]