Simo Sorce <[email protected]> writes:

> Just for curiosity, would it be ok change,
> LE_READ_UINT64/LE_WRITE_UINT64 to use uint64_t and have the macro use
> __bswap64 if available or fall back to the original unoptimized code ?

I think LE_READ_UINT64 and related macros should be left as is,
and used for unaligned reads and writes.

For xts_shift, there are three cases, and I think they could be handled
like this:

1. Little-endian system:
   Use 64-bit reads and writes, no swaps needed.

2. Big-endian system, __builtin_bswap64 detected (HAVE_BUILTIN_BSWAP64):
   Use 64-bit reads and writes + __builtin_bswap64.

3. Big-endian system, no __builtin_bswap64. Here we can either use the
   current code, with byte accesses only. Or attempt to define byteswap
   without builtins and follow 2. I'd lean towards using the current
   code, unless there's some system where something more complicated is
   known to improve performance.

In case (1) or (2), use 64-bit operations exclusively, also for the
carry logic. And then the macros and #ifdefs should be arranged to make
it not too ugly. E.g., cases (1) and (2) could perhaps be combined, with
a BSWAP64_IF_BE macro or the like. Or a macro like
LE_READ_ALIGNED_UINT64 (with an uint64_t * argument)

Only current use of __builtin_bswap64 is in ctr.c, the ctr_fill16
function.

An older example is in salsa20-core-internal.c, with a LE_SWAP32 macro.
That one could take advantage of __builtin_bswap32, if available. I
don't know if the compiler can recognize the current expression as a
byte swap.

And it probably is more important to optimize xts_shift if/when one also
arranges the code to produces multiple blocks of T_k values at a time
(say, 32 blocks, 512 bytes of stack-allocated temporary storage),
without reading the intermediate values back from memory to registers.
That has been a significant optimization for both ctr mode and cbc
decrypt.

I haven't reviewed the new version of the patch yet, I hope to get to
that in a day or two.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
_______________________________________________
nettle-bugs mailing list
[email protected]
http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs

Reply via email to