On 14.01.2016 16:54, Rob Austein wrote:
At Thu, 14 Jan 2016 11:01:38 +0300, Pavel Shatov wrote:

Rob, could you please dump "rho" for all three curves in
ecdsa_curves.h? I have a feeling that "rho" will always be just 1 (one).

Ah.  Hadn't noticed that, I'm just calling the libtfm setup function:

   sw/thirdparty/libtfm/tomsfastmath/src/mont/fp_montgomery_setup.c

but you're correct that the computed value is one for all three of
those curves.

That's what I thought. Well, the "trick" of Montgomery reduction is to shift the temporary product to the right after every iteration, which keeps its bit width from growing. The shift only works if the bits being shifted out are zeroes. They can be zeroed out by adding a suitable multiple of the modulus, which does not change the value modulo the modulus.

Now if reduction is done bit-by-bit, the modulus is added to the temporary result whenever the result's lowest bit is set. Since the modulus must be odd, adding it to an odd temporary result produces an even number, i.e. one whose lowest bit is zero. That zero bit can then be safely shifted away.
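
To make the bit-by-bit variant concrete, here is a minimal Python sketch. The function name and the bound on the input are my own assumptions for illustration; this is not code from libtfm or from the FPGA core.

```python
def mont_reduce_bitwise(t, n, k):
    """Return t * 2^(-k) mod n, for odd n and 0 <= t < n * 2^k.

    Illustrative helper only (hypothetical name).
    """
    assert n & 1, "modulus must be odd"
    assert 0 <= t < (n << k), "input out of range for a single final subtraction"
    for _ in range(k):
        if t & 1:      # lowest bit set: adding the odd modulus clears it
            t += n
        t >>= 1        # lowest bit is zero now, so the shift loses nothing
    if t >= n:         # the loop leaves t < 2n, one subtraction is enough
        t -= n
    return t
```

For example, mont_reduce_bitwise(100, 23, 5) gives the same result as (100 * pow(2, -5, 23)) % 23 in Python 3.8 or later.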

Reduction can also be done word-by-word, which is much faster. That's how the FPGA (and apparently libtfm) works. In that sense "fp_digit" is actually a 32-bit number, so the algorithm zeroes out 32 bits at a time. To do this one needs a special speed-up factor that depends on the lower 32 bits of the modulus; this is the "rho" in question. Btw, that's why you have to toggle the init bit of the ModExpS6 core after you change the modulus: the core has to pre-calculate the new speed-up factor. I guess the setup function in libtfm does the same.
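
A rough Python sketch of the word-by-word variant, assuming 32-bit words and the usual definition of the speed-up factor as -1/n mod 2^32. The names mont_setup and mont_reduce_wordwise are hypothetical; neither libtfm nor the ModExpS6 core is being quoted here, and a real implementation would walk an array of digits instead of shifting one big integer.

```python
W = 32                     # word size in bits (assumed, matching a 32-bit fp_digit)
MASK = (1 << W) - 1

def mont_setup(n):
    """Speed-up factor: -1/n mod 2^W. Depends only on the lowest word of n."""
    assert n & 1, "modulus must be odd"
    return (-pow(n & MASK, -1, 1 << W)) & MASK

def mont_reduce_wordwise(t, n, num_words, rho):
    """Return t * 2^(-W*num_words) mod n, for 0 <= t < n * 2^(W*num_words)."""
    for _ in range(num_words):
        m = ((t & MASK) * rho) & MASK   # multiple of n that clears the low word
        t = (t + m * n) >> W            # low W bits are all zero, shift them away
    if t >= n:                          # result is below 2n at this point
        t -= n
    return t
```

Each iteration clears a whole word because (t mod 2^32) + m * n is congruent to 0 modulo 2^32 when m is chosen this way.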

The NIST primes for these three curves all have their lower 32 bits set to ones, so the speed-up factor is just 1 and there's no need for the FPGA to calculate it at all. Since I'm trying to write an ECDSA core, not a general-purpose EC math core, I thought it would make sense to take advantage of that fact and get rid of the redundant coefficient.
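
This is easy to check in a few lines of Python. The sketch below assumes that the three curves in question are P-256, P-384 and P-521 and uses their standard prime moduli; the helper names are made up for illustration.

```python
W = 32
MASK = (1 << W) - 1

# Standard NIST prime moduli (assumed to be the three curves discussed here).
NIST_PRIMES = {
    "P-256": 2**256 - 2**224 + 2**192 + 2**96 - 1,
    "P-384": 2**384 - 2**128 - 2**96 + 2**32 - 1,
    "P-521": 2**521 - 1,
}

def speedup_factor(n):
    """-1/n mod 2^W, computed from the lowest word of n only."""
    return (-pow(n & MASK, -1, 1 << W)) & MASK

def reduce_step_factor_one(t, n):
    """One word-wise step when the factor is 1: the multiplier is just
    the low word of t, so no extra multiplication by the factor is needed."""
    return (t + (t & MASK) * n) >> W

for name, p in NIST_PRIMES.items():
    # Print the curve name, the low word of its prime, and the factor.
    print(name, hex(p & MASK), speedup_factor(p))
```

For all three primes the low word is 0xffffffff, which is -1 modulo 2^32, so the factor -1/(-1) comes out as 1.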


--
With best regards,
Pavel Shatov

_______________________________________________
Tech mailing list
Tech@cryptech.is
https://lists.cryptech.is/listinfo/tech
