On Fri, 2019-03-15 at 22:33 +0100, Niels Möller wrote:
> Simo Sorce <[email protected]> writes:
>
> > Turns out the algorithm is not equivalent, as the shift is applied to
> > the array as if it were a big 128bit little endian value, the endianess
> > of the two is different.
>
> Ah, I see.
>
> > /* shift one and XOR with 0x87. */
> > /* src and dest can point to the same buffer for in-place operations */
> > static void
> > xts_shift(union nettle_block16 *dst,
> >           const union nettle_block16 *src)
> > {
> >   uint8_t carry = src->b[15] >> 7;
> >   dst->u64[1] = (src->u64[1] << 1) | (src->u64[0] >> 63);
> >   dst->u64[0] = src->u64[0] << 1;
> >   dst->b[0] ^= 0x87 & -carry;
> > }
>
> This will then work only on little-endian systems?
>
> I think it would be nice with a structure like
>
>   b0 = src->u64[0]; b1 = src->u64[1];  /* Load inputs */
>   ... swap if big-endian ...
>   uint64_t carry = (b1 >> 63);
>   b1 = (b1 << 1) | (b0 >> 63)
>   b0 = (b0 << 1) ^ (0x87 & -carry);
>   ... swap if big-endian ...
>   dst->u64[0] = b0; dst->u64[1] = b1;  /* Store output */
>
> I.e., no memory accesses smaller than 64-bits.
>
> Possibly with load + swap and swap + store done with some
> system-dependent macros.
>
> But it's not essential for a first version of xts; copying block_mulx
> and just replacing READ_UINT64 with LE_READ_UINT64 and similarly for
> WRITE would be ok for now. There are more places with potential for
> micro-optimizations related to endianness. While I think the
> READ/WRITE_UINT macros are adequate in most places where unaligned
> application data is read and written by C code.
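For illustration, here is a minimal sketch of the structure suggested above, with the "swap if big-endian" steps filled in. It rests on a few assumptions: union block16 is only a stand-in for union nettle_block16, bswap64 and LE_SWAP64 are hypothetical names (not existing nettle macros), and big-endian hosts are detected through WORDS_BIGENDIAN from autoconf or the GCC/Clang __BYTE_ORDER__ predefines. It is not the patch that was eventually submitted.

```c
#include <stdint.h>

/* Stand-in for union nettle_block16 (assumed layout: 16 bytes
   overlaid with two 64-bit words). */
union block16
{
  uint8_t b[16];
  uint64_t u64[2];
};

/* Portable 64-bit byte swap; a real build would probably prefer a
   compiler builtin or a system header where available. */
static inline uint64_t
bswap64(uint64_t x)
{
  x = ((x & 0x00ff00ff00ff00ffULL) << 8) | ((x >> 8) & 0x00ff00ff00ff00ffULL);
  x = ((x & 0x0000ffff0000ffffULL) << 16) | ((x >> 16) & 0x0000ffff0000ffffULL);
  return (x << 32) | (x >> 32);
}

/* Hypothetical macro: convert a 64-bit word between host byte order
   and little-endian. It is a no-op on little-endian hosts. */
#if defined(WORDS_BIGENDIAN) \
  || (defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
# define LE_SWAP64(x) bswap64(x)
#else
# define LE_SWAP64(x) (x)
#endif

/* Shift the block left by one bit, treating it as a 128-bit
   little-endian value, and XOR 0x87 into the low byte if a bit was
   shifted out. src and dst may point to the same buffer. */
static void
xts_shift(union block16 *dst, const union block16 *src)
{
  uint64_t b0 = LE_SWAP64(src->u64[0]); /* low 64 bits */
  uint64_t b1 = LE_SWAP64(src->u64[1]); /* high 64 bits */
  uint64_t carry = b1 >> 63;

  b1 = (b1 << 1) | (b0 >> 63);
  b0 = (b0 << 1) ^ (0x87 & -carry);

  dst->u64[0] = LE_SWAP64(b0);
  dst->u64[1] = LE_SWAP64(b1);
}
```

With this version all memory accesses are 64 bits wide, and a block whose only set bit is the top bit of b[15] comes out with b[0] == 0x87 and all other bytes zero, regardless of host byte order.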

I will add the macros to swap endianness, and resend a new version.

Simo.

-- 
Simo Sorce
Sr. Principal Software Engineer
Red Hat, Inc
