Hi Eric,

2017-03-30 21:55 GMT+02:00 Eric Biggers <ebigge...@gmail.com>:
> This is an improvement; I'm just thinking that maybe this should be done for
> all the gf128mul_x_*() functions, if only so that they use a consistent style
> and are all defined next to each other.

Right, that doesn't seem to be a bad idea... I was confused for a
while by the '& 0xff' in the _lle one, but now I see it also uses just
two values of the table, so it can be re-written in a similar way. In
fact, the OCB mode from RFC 7253 (which I'm currently trying to port to
the kernel crypto API) uses gf128mul_x_bbe, so it would be useful to have
that one accessible, too.
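
A minimal sketch of what such table-free _lle and _bbe variants could
look like. This is an illustration only, not the actual patch: the struct
gf128_words type and the mask_from_bit() and sketch_mul_x_*() names are
hypothetical, and both 64-bit words are assumed to be in host byte order
already, so the kernel's endianness conversions are left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's be128, with both words already
 * in host byte order (assumption made to keep the sketch self-contained). */
struct gf128_words {
	uint64_t a;	/* most significant 64 bits */
	uint64_t b;	/* least significant 64 bits */
};

/* All-ones if the given bit of x is set, all-zeroes otherwise. */
static inline uint64_t mask_from_bit(uint64_t x, unsigned int bit)
{
	return (uint64_t)0 - ((x >> bit) & 1);
}

/* Multiply by x in the "lle" convention: shift the 128-bit value right by
 * one and fold the shifted-out bit back into the top byte. */
static inline void sketch_mul_x_lle(struct gf128_words *r,
				    const struct gf128_words *x)
{
	uint64_t a = x->a;
	uint64_t b = x->b;
	/* A table index of (b << 7) & 0xff can only be 0x00 or 0x80, so only
	 * two table values are ever used: zero or the 0xe1 reduction byte. */
	uint64_t tt = mask_from_bit(b, 0) & ((uint64_t)0xe1 << 56);

	r->b = (b >> 1) | (a << 63);
	r->a = (a >> 1) ^ tt;
}

/* Multiply by x in the "bbe" convention: shift the 128-bit value left by
 * one and fold the shifted-out bit back into the low byte. */
static inline void sketch_mul_x_bbe(struct gf128_words *r,
				    const struct gf128_words *x)
{
	uint64_t a = x->a;
	uint64_t b = x->b;
	/* Only the top bit of a matters: the reduction value is 0 or 0x87. */
	uint64_t tt = mask_from_bit(a, 63) & 0x87;

	r->a = (a << 1) | (b >> 63);
	r->b = (b << 1) ^ tt;
}
```

Both helpers read a and b before writing, so r and x may point to the
same value.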

I will move them all in v2, then.

> Also note that '(b & ((u64)1 << 63)) ? 0x87 : 0x00;' is actually getting
> compiled as '((s64)b >> 63) & 0x87', which is branchless and therefore makes
> the new version more efficient than one might expect:
>
>         sar    $0x3f,%rax
>         and    $0x87,%eax
>
> It could even be written the branchless way explicitly, but it shouldn't
> matter.

I think the definition using unsigned operations is more intuitive...
Let's just leave the clever tricks up to the compiler :)
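
For reference, a small sketch comparing the forms discussed above. The
function names are hypothetical and chosen only for this illustration.
The second variant assumes that converting to a signed type and shifting
a negative value right behaves as an arithmetic shift, which is
implementation-defined in C (GCC and Clang do it that way); the third
gets the same result with unsigned operations only.

```c
#include <stdint.h>

/* The unsigned test quoted above: 0x87 if the top bit is set, else 0. */
static inline uint64_t carry_ternary(uint64_t b)
{
	return (b & ((uint64_t)1 << 63)) ? 0x87 : 0x00;
}

/* Explicit branchless form using a signed shift. Assumes the compiler
 * implements >> on negative values as an arithmetic shift. */
static inline uint64_t carry_sar(uint64_t b)
{
	return (uint64_t)((int64_t)b >> 63) & 0x87;
}

/* Branchless form using only unsigned arithmetic: 0 - 1 wraps around to
 * all-ones, 0 - 0 stays zero. */
static inline uint64_t carry_neg(uint64_t b)
{
	return ((uint64_t)0 - (b >> 63)) & 0x87;
}
```

All three return the same value for every input on such a compiler, so
the choice between them is a matter of readability.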

Thanks,
O.M.

>
> - Eric
