James Cloos <[email protected]> writes:

> To give an example of how high, openssh's C implementation of chacha20
> with poly1305 is faster than openssl's non-aesni amd64 assembly for
> aes128-gcm, and both significantly outperform ssh's use of openssl's
> aes128-ctr or -cbc assembly with openssh's umac-64.

Benchmarking nettle's implementations on my office machine (Core i5),
with openssl for comparison:

algorithm   implementation   cycles/byte
salsa20     nettle           5.3
aes128      nettle           11
aes128      openssl          22
arcfour     nettle           7.5
arcfour     openssl          3.75

(For aes, I'm surprised by the big difference from openssl. Nettle's aes
assembly is pretty basic, and on this machine it seems to give only a
very marginal improvement over the C implementation, which runs at 12
cycles/byte. Maybe something is fishy with the Ubuntu openssl package,
or there's some problem with my benchmarking.)

Anyway, getting back to chacha, it will be interesting to see how much
faster chacha is than salsa20.

If I remember the chacha changes correctly, one gets rid of a
permutation of the matrix, and I think some of the rotations in the
round function (done as movaps, pslld, psrld, pxor) can be replaced by a
pshufd. I think that can reduce the instruction count for the round
function by 25-50%, depending on how many of the rotations can be
replaced (there ought to be at least one rotation left with a rotation
count which isn't a multiple of 8).
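
For reference, here is a scalar C sketch of the two quarter rounds, using
the rotation counts from the published Salsa20 and ChaCha specifications
(7, 9, 13, 18 and 16, 12, 8, 7 respectively). The function names are made
up for illustration and are not nettle's. The last helper shows why the
ChaCha rotations by 16 and 8 are candidates for a shuffle: rotating by a
multiple of 8 bits only permutes the bytes within each 32-bit word. As far
as I know, on x86 a shuffle at byte granularity is pshufb (SSSE3), while
pshufd moves whole 32-bit words, so treat the exact instruction choice as
an open question.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits, for 0 < n < 32. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32 - n));
}

/* Salsa20 quarter round: rotation counts 7, 9, 13, 18,
   none of which is a multiple of 8. */
static void salsa20_quarter_round(uint32_t *a, uint32_t *b,
                                  uint32_t *c, uint32_t *d)
{
    *b ^= rotl32(*a + *d, 7);
    *c ^= rotl32(*b + *a, 9);
    *d ^= rotl32(*c + *b, 13);
    *a ^= rotl32(*d + *c, 18);
}

/* ChaCha quarter round: rotation counts 16, 12, 8, 7.
   Two of the four (16 and 8) are multiples of 8. */
static void chacha_quarter_round(uint32_t *a, uint32_t *b,
                                 uint32_t *c, uint32_t *d)
{
    *a += *b; *d ^= *a; *d = rotl32(*d, 16);
    *c += *d; *b ^= *c; *b = rotl32(*b, 12);
    *a += *b; *d ^= *a; *d = rotl32(*d, 8);
    *c += *d; *b ^= *c; *b = rotl32(*b, 7);
}

/* Rotate left by 8 written as a pure byte permutation, to illustrate
   what a byte shuffle would do to each 32-bit word. */
static uint32_t rotl8_by_bytes(uint32_t x)
{
    uint8_t in[4] = { (uint8_t) x, (uint8_t) (x >> 8),
                      (uint8_t) (x >> 16), (uint8_t) (x >> 24) };
    /* Least significant byte first: old byte 3 wraps around to byte 0. */
    uint8_t out[4] = { in[3], in[0], in[1], in[2] };
    return (uint32_t) out[0] | ((uint32_t) out[1] << 8)
         | ((uint32_t) out[2] << 16) | ((uint32_t) out[3] << 24);
}

/* Sanity check: the byte permutation agrees with the shift-based rotate. */
static void quarter_round_selftest(void)
{
    uint32_t x = 0x12345678u;
    assert(rotl8_by_bytes(x) == rotl32(x, 8));
    assert(rotl32(x, 16) == ((x >> 16) | (x << 16)));
}
```

In the ChaCha quarter round that leaves the rotations by 12 and 7 to be
done with shifts, which is consistent with the expectation above that at
least one rotation count is not a multiple of 8.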

> like gcm, safer than most current usage of separate macs.

Are you saying that chacha + poly1305 is not used in the obvious way as
a stream cipher + a separate mac? Care to elaborate?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
_______________________________________________
nettle-bugs mailing list
[email protected]
http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs