[email protected] (Niels Möller) writes:

> I think there are three main pieces left to integrate.
>
> 1. Curve operations to support Curve448 (i.e., diffie-hellman
>    operations). I have made some progress, on my curve448 branch,
>
> 2. SHAKE 128/256. I think I had some question on the interface design.
>
> 3. EdDSA 448.
>
> Optimization of the mod p arithmetic isn't that important yet,

I see.  I thought that the performance of the curve operations should at
least be comparable to P-521.  However, even with the generic ecc_mod
for mod p, the two are already close.  So let's look at the above items
first.  I have rebased my patch implementing (1) on the curve448 branch:
https://gitlab.com/dueno/nettle/commits/wip/dueno/curve448-2

One thing I noticed is that the point addition formula for untwisted
curves doesn't look correct:
https://gitlab.com/dueno/nettle/commit/4e3a50f4a50d8d03536dc107d7b77c84462e3068#6c80341e16ba39077bf2507d8450393d7e7e677a_261_262

> but I'll nevertheless try to explain how I think about it.

Thank you for the detailed explanation.  I ran the benchmark for these
three variants: (1) the original version using ecc_mod, (2) the two-step
reduction you suggested, and (3) my formula, optimized to use single
7-limb operations:

size   modp reduce   modq modinv mi_gcd mi_pow dup_jj ad_jja ad_hhh  mul_g  mul_a (us)
 448 0.0727 0.0720 0.0739  44.01  1.451  52.92  1.088  1.456  1.406  299.6  557.6
 521 0.0139 0.0151 0.1003  77.72  1.703 101.59  0.728  0.995  1.277  255.8  588.4

 448 0.0496 0.0497 0.0764  34.77  1.500  49.59  0.923  1.158  1.169  273.5  500.1
 521 0.0147 0.0144 0.1027  77.63  1.816  88.57  0.716  0.934  1.276  237.2  589.9

 448 0.0641 0.0644 0.0809  52.76  1.570  49.42  1.007  1.340  1.343  288.1  570.5
 521 0.0139 0.0141 0.0967  78.22  1.697  99.44  0.714  1.012  1.264  235.8  589.2

on a Core i7-6600U CPU @ 2.60GHz.

My code could be wrong or inefficient, but (2) is actually the fastest.
(3) is slower because of the final carry handling: the accumulated carry
can be as large as 3, and wrapping it around with cnd_add_n seems to be
costly.
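
To make the carry issue concrete, here is a rough sketch of a single-pass
fold modulo p = 2^448 - 2^224 - 1, using 2^448 = 2^224 + 1 (mod p).  It is
not the code from either branch: Python big integers stand in for the limb
arrays, and the names fold_once, wrap_carry and reduce_p448 are made up
for this illustration.  It only shows why up to 3 can accumulate above
bit 448 and has to be wrapped around before the final subtraction.

```python
# Illustrative sketch only; big integers replace nettle's limb arrays,
# and all function names here are hypothetical.
P448 = 2**448 - 2**224 - 1       # the Curve448 field prime
MASK224 = 2**224 - 1
MASK448 = 2**448 - 1

def fold_once(x):
    """Fold x < 2^896 to a value below 2^450, congruent to x mod p."""
    x0 = x & MASK448             # low 448 bits
    x1 = x >> 448                # high part, x1 < 2^448
    a = x1 & MASK224             # split x1 = a + 2^224 * b
    b = x1 >> 224
    # x1 * 2^448 = x1 + x1 * 2^224 (mod p); the b * 2^448 term that
    # appears folds once more, giving
    #   x = x0 + (a + b) + (a + 2*b) * 2^224   (mod p)
    return x0 + (a + b) + ((a + 2 * b) << 224)

def wrap_carry(t):
    """Wrap the bits above 2^448 back in; the result is below 2^448."""
    # Real code would use constant-time conditional adds instead of a
    # data-dependent loop; this loop is only for readability.
    while t >> 448:
        carry = t >> 448         # at most 3 right after fold_once
        t = (t & MASK448) + carry * (2**224 + 1)
    return t

def reduce_p448(x):
    """Canonical reduction of x < 2^896 modulo p."""
    t = wrap_carry(fold_once(x))
    return t - P448 if t >= P448 else t
```

The sum in fold_once is below 4 * 2^448, which is where the bound of 3 on
the carry comes from; wrap_carry needs at most two passes for such inputs.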

Regards,
-- 
Daiki Ueno
_______________________________________________
nettle-bugs mailing list
[email protected]
http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs
