About primes and speed: I think unsaturated limb arithmetic is the way to go, as it lets (a+b)*(c+d) be computed with only one carry chain (in the multiply). We don't need long runs of additions for ECC, so the spare bits in each limb are enough to absorb the uncarried sums. A prime of the form 2^b-a, for small a, is limb-size agnostic, while pseudo-Mersenne primes seem to strongly favor some limb sizes over others.
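To make the single-carry-chain point concrete, here is a minimal sketch in Python. It is only an illustration under assumptions that are not part of the post: the prime is taken to be 2^255-19, the representation is five 51-bit limbs, and all names (to_limbs, add_lazy, mul, mul_of_sums) are hypothetical. The additions are done limb-wise with no carries at all, and the only carry propagation happens once, at the end of the multiply.

```python
# Sketch only. Assumed parameters (not from the post): p = 2^255 - 19,
# five unsaturated limbs of 51 bits each, as they would sit in 64-bit words.
B = 255
A = 19
P = (1 << B) - A
LIMB_BITS = 51
NLIMBS = 5
MASK = (1 << LIMB_BITS) - 1


def to_limbs(x):
    """Split an integer 0 <= x < 2^255 into five 51-bit limbs."""
    return [(x >> (LIMB_BITS * i)) & MASK for i in range(NLIMBS)]


def from_limbs(limbs):
    """Recombine limbs (possibly loosely reduced) into an integer mod p."""
    return sum(l << (LIMB_BITS * i) for i, l in enumerate(limbs)) % P


def add_lazy(f, g):
    """Limb-wise addition with no carry: limbs grow to at most 52 bits."""
    return [x + y for x, y in zip(f, g)]


def mul(f, g):
    """Schoolbook multiply with the reduction folded in, then one carry chain."""
    acc = [0] * NLIMBS
    for i in range(NLIMBS):
        for j in range(NLIMBS):
            k = i + j
            if k < NLIMBS:
                acc[k] += f[i] * g[j]
            else:
                # 2^255 = 19 (mod p), so high partial products wrap around
                # into the low limbs with a factor of A.
                acc[k - NLIMBS] += A * f[i] * g[j]
    # With 52-bit inputs every accumulator stays below 2^111, so it would
    # fit a 128-bit accumulator in a fixed-width implementation.
    assert all(t < (1 << 128) for t in acc)

    # The one carry chain: propagate from the lowest limb to the highest.
    out = []
    carry = 0
    for t in acc:
        t += carry
        out.append(t & MASK)
        carry = t >> LIMB_BITS
    # Fold the carry out of the top limb back into limb 0, then push the
    # overflow one step up. The result is loosely reduced: limb 1 may be
    # slightly above 51 bits, which later lazy additions can tolerate.
    out[0] += A * carry
    out[1] += out[0] >> LIMB_BITS
    out[0] &= MASK
    return out


def mul_of_sums(a, b, c, d):
    """Compute (a+b)*(c+d) mod p with uncarried additions feeding one multiply."""
    s = add_lazy(to_limbs(a), to_limbs(b))
    t = add_lazy(to_limbs(c), to_limbs(d))
    return from_limbs(mul(s, t))
```

With saturated limbs, each of the two additions would need its own carry propagation before the multiply could start. Changing LIMB_BITS and NLIMBS (for example to ten limbs of 26 bits, rounding up the last) changes only the bounds, not the structure, as long as the limbs still cover B bits exactly; a limb count that does not divide B evenly needs mixed limb widths, which this sketch does not handle.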
However, this gets down to "implement and report" territory. The naive saturated arithmetic plus Barrett/Montgomery reduction is a pretty good first cut as far as I know, but it does hurt vectorization. Unsaturated arithmetic plus handcrafted reduction is tougher to write, but can be faster. Until people think hard about it, we don't know which works better for each prime. The best way is probably to compare primes on working implementations.

Sincerely,
Watson Ladd
_______________________________________________
Curves mailing list
[email protected]
https://moderncrypto.org/mailman/listinfo/curves
