Niels Möller <[email protected]> writes:

> In initial benchmarking, this loop appears to run in 4.2 cycles per
> iteration on my laptop, and a slowdown by a factor of 3 compared to the
> current C implementation of ghash_update. Penalty may be a bit less for
> assembly implementation, but I haven't tried.

I've got the code to work, and I've written an x86_64 assembly
implementation using sse2 instructions. The code is on the
ghash-sidechannel-silent branch. On my laptop, I seem to get these
numbers:

Old C implementation: 350 MB/s
Old asm implementation: 388 MB/s

New C implementation: 116 MB/s
New asm implementation: 196 MB/s

pclmul implementation: 4047 MB/s

In the new asm code, the inner loop is 

.Loop_bit:
        movaps  ONE, M0
        pand    X, M0
        pcmpeqd ONE, M0
        pshufd  $0xaa, M0, M1
        pshufd  $0, M0, M0
        psrlq   $1, X
        pand    (KEY, CNT), M0
        pand    1024(KEY, CNT), M1
        pxor    M0, R
        pxor    M1, R

        add     $16, CNT
        jnz     .Loop_bit

It appears to run in 283 cycles/block, or 4.4 cycles per iteration of
the above loop (64 iterations per block). I think that indicates that
the bottleneck is instruction issue, 3 instructions per cycle on this
processor (AMD Ryzen 5). It's a bit annoying that it takes as many as 5
instructions to extend the two bits from X (bit indices 0 and 64) to
the two 128-bit mask words M0, M1. Maybe there's some more clever way?
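
For readers who don't want to decode the sse2, here is a rough model of
what the loop computes, in portable C. It is only a sketch: the struct
and function names are made up for illustration and are not the names
used on the branch, and the table layout (entry i paired with bit i of
the low half, entry 64 + i with bit i of the high half) is an assumption
read off the 1024-byte offset in the loop. The branching function is
there only as a reference to compare against; it is exactly what the
masked version avoids.

```c
#include <stdint.h>

/* Hypothetical type for illustration: a 128-bit value as two halves. */
struct block128 { uint64_t lo, hi; };

/* Masked select-and-accumulate, modelling the asm loop: no branches or
   table indices depend on the bits of x. Assumed layout: table[i] goes
   with bit i of x.lo, table[64 + i] with bit i of x.hi. */
struct block128
mask_select_xor(const struct block128 table[128], struct block128 x)
{
  struct block128 r = { 0, 0 };
  unsigned i;
  for (i = 0; i < 64; i++)
    {
      /* 0 - bit gives all-ones if the bit is set, zero otherwise. */
      uint64_t m0 = 0 - (x.lo & 1);
      uint64_t m1 = 0 - (x.hi & 1);
      x.lo >>= 1;
      x.hi >>= 1;
      r.lo ^= (table[i].lo & m0) ^ (table[64 + i].lo & m1);
      r.hi ^= (table[i].hi & m0) ^ (table[64 + i].hi & m1);
    }
  return r;
}

/* Branching reference, for comparison only; NOT side-channel silent. */
struct block128
branchy_xor(const struct block128 table[128], struct block128 x)
{
  struct block128 r = { 0, 0 };
  unsigned i;
  for (i = 0; i < 64; i++)
    {
      if ((x.lo >> i) & 1)
        { r.lo ^= table[i].lo; r.hi ^= table[i].hi; }
      if ((x.hi >> i) & 1)
        { r.lo ^= table[64 + i].lo; r.hi ^= table[64 + i].hi; }
    }
  return r;
}

/* Small deterministic generator (splitmix64) to fill a demo table. */
uint64_t
demo_rand64(uint64_t *s)
{
  uint64_t z = (*s += 0x9E3779B97F4A7C15ULL);
  z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
  z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
  return z ^ (z >> 31);
}

/* Returns 1 if both functions agree on a batch of pseudo-random inputs. */
int
mask_select_self_check(void)
{
  struct block128 table[128];
  uint64_t seed = 1;
  unsigned i;
  for (i = 0; i < 128; i++)
    {
      table[i].lo = demo_rand64(&seed);
      table[i].hi = demo_rand64(&seed);
    }
  for (i = 0; i < 100; i++)
    {
      struct block128 x, a, b;
      x.lo = demo_rand64(&seed);
      x.hi = demo_rand64(&seed);
      a = mask_select_xor(table, x);
      b = branchy_xor(table, x);
      if (a.lo != b.lo || a.hi != b.hi)
        return 0;
    }
  return 1;
}
```

The two mask computations in the C loop correspond to the five-instruction
movaps/pand/pcmpeqd/pshufd/pshufd sequence; the rest maps roughly one to
one onto the psrlq, pand and pxor instructions.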

I can see some possible improvements; one could use the sign bit
instead, replacing the first three instructions by two: movaps X, M0;
psrlq $63, M0. Or one could do 4 bits (e.g., sign bits 127, 95, 63, 31)
instead of just 2, with only two more pshufd to create the additional
masks. Together, I think that would be a loop of 17 instructions for
doing 4 bits.
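
A sketch of the 4-bit variant, again as a portable C model rather than
real code from the branch. Several things here are assumptions: the
names are hypothetical, the table is assumed to be reordered so that
entry 32*k + i goes with bit 31 - i of 32-bit lane k, and the mask is
assumed to come from the sign bit filling its whole lane (in sse2 terms
that would be an arithmetic shift such as psrad $31, which is my
reading and not what is written above). Under those assumptions one
possible tally is 2 instructions for copy and shift, 4 pshufd, 1 shift
of X, 4 pand, 4 pxor and 2 for loop control, which gives 17.

```c
#include <stdint.h>

/* Hypothetical type for illustration: 128 bits as four 32-bit lanes. */
struct lanes128 { uint32_t w[4]; };

/* Four bits per iteration, taken from the sign bit of each lane.
   Assumed layout: table[32*k + i] goes with bit 31 - i of x.w[k]. */
struct lanes128
sign_select_xor(const struct lanes128 table[128], struct lanes128 x)
{
  struct lanes128 r = { { 0, 0, 0, 0 } };
  unsigned i, j, k;
  for (i = 0; i < 32; i++)
    for (k = 0; k < 4; k++)
      {
        /* All-ones if the lane's sign bit is set, zero otherwise. */
        uint32_t m = 0 - (x.w[k] >> 31);
        x.w[k] <<= 1;  /* move the next bit into the sign position */
        for (j = 0; j < 4; j++)
          r.w[j] ^= table[32 * k + i].w[j] & m;
      }
  return r;
}

/* Branching reference, for comparison only; NOT side-channel silent. */
struct lanes128
sign_branchy_xor(const struct lanes128 table[128], struct lanes128 x)
{
  struct lanes128 r = { { 0, 0, 0, 0 } };
  unsigned i, j, k;
  for (k = 0; k < 4; k++)
    for (i = 0; i < 32; i++)
      if ((x.w[k] >> (31 - i)) & 1)
        for (j = 0; j < 4; j++)
          r.w[j] ^= table[32 * k + i].w[j];
  return r;
}

/* Small deterministic generator (xorshift32) to fill a demo table. */
uint32_t
demo_rand32(uint32_t *s)
{
  uint32_t v = *s;
  v ^= v << 13;
  v ^= v >> 17;
  v ^= v << 5;
  return *s = v;
}

/* Returns 1 if both functions agree on a batch of pseudo-random inputs. */
int
sign_select_self_check(void)
{
  struct lanes128 table[128];
  uint32_t seed = 2463534242u;
  unsigned i, j;
  for (i = 0; i < 128; i++)
    for (j = 0; j < 4; j++)
      table[i].w[j] = demo_rand32(&seed);
  for (i = 0; i < 100; i++)
    {
      struct lanes128 x, a, b;
      for (j = 0; j < 4; j++)
        x.w[j] = demo_rand32(&seed);
      a = sign_select_xor(table, x);
      b = sign_branchy_xor(table, x);
      for (j = 0; j < 4; j++)
        if (a.w[j] != b.w[j])
          return 0;
    }
  return 1;
}
```

The loop runs 32 times instead of 64, so if the instruction count does
come out at 17 per iteration, that is 17 instructions per 4 bits
against 12 per 2 bits in the current loop.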

Regards,
/Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.