On 02/06/2011 12:08 AM, Niels Möller wrote:

> The unoptimized GF(2^128) multiply function really is awfully slow. On
> x86_64, gmac takes 830 cycles/byte! We can compare to the sha functions,
> where sha1, sha256 and sha512 take respectively 8, 18 and 12
> cycles/byte, so the current code is two orders of magnitude slower than
> hmac-sha1.
> It remains to be seen how much table space and/or assembly hacking is
> needed to get reasonable performance.

There is a special instruction for that on new Intel and AMD CPUs,
PCLMULQDQ (carry-less multiplication):
http://software.intel.com/en-us/articles/intel-carry-less-multiplication-instruction-and-its-usage-for-computing-the-gcm-mode/
http://en.wikipedia.org/wiki/CLMUL_instruction_set
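
For reference, here is a rough portable sketch of what a carry-less
multiply computes: an ordinary shift-and-add multiplication with the
additions replaced by XOR, which is polynomial multiplication over GF(2).
The name clmul64 is made up for illustration and is not part of nettle.
A GF(2^128) multiply would still need several such products plus a
reduction modulo the GCM polynomial, which is not shown here. On CPUs
with the instruction, compilers expose it as the _mm_clmulepi64_si128
intrinsic in <wmmintrin.h>.

```c
#include <stdint.h>

/* Hypothetical helper (not a nettle function): 64x64 -> 128-bit
   carry-less multiplication done in software, one bit of b at a time.
   The result is returned as two 64-bit halves, *hi and *lo. */
static void
clmul64(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
  uint64_t h = 0, l = 0;
  unsigned i;

  for (i = 0; i < 64; i++)
    {
      /* All-ones if bit i of b is set, zero otherwise. Using a mask
         instead of a branch avoids a data-dependent jump. */
      uint64_t mask = -((b >> i) & 1);

      /* XOR in a * x^i; the low part stays in l, the overflow goes to h. */
      l ^= (a << i) & mask;
      if (i > 0)                 /* a >> 64 would be undefined in C */
        h ^= (a >> (64 - i)) & mask;
    }

  *hi = h;
  *lo = l;
}
```

As a quick check, clmul64(3, 3, &hi, &lo) gives hi = 0 and lo = 5, since
(x + 1)^2 = x^2 + 1 over GF(2), whereas the ordinary integer product
would be 9.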

Unfortunately I don't have anything close to those CPUs...

regards,
Nikos
_______________________________________________
nettle-bugs mailing list
[email protected]
http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs