> On Oct 23, 2014, at 10:05 AM, Trevor Perrin <[email protected]> wrote:
> 
> On Thu, Oct 23, 2014 at 5:04 AM, Samuel Neves <[email protected]> wrote:
>> 
>> The Haswell cycle counts mentioned in the paper do not take Turbo Boost
>> into account, and therefore are lower than the real number; taking into
>> account that the Core i7 4770 chip was used (3.4 to 3.9 GHz overclocking),
>> the Haswell cycle count should be ~893000.  I have been able to get this
>> slightly down to ~884000.
>>
>> On Sandy Bridge, I get somewhat better timings than reported by DJB:
>> ~1030000 cycles.
> 
> Thanks! Updated [1].
> 
> By that scoring, Mike's Goldilocks implementation retains the
> "relative efficiency" crown.  But the E-521 numbers are without ASM
> optimization.  And their 9 limbs / 58-bit radix seems impressive
> (Goldilocks uses 8 limbs / 56-bit radix).
> 
> So this seems pretty close.  I wonder what a better-optimized 521 could do...
> 
> 
> Trevor
> 
> 
> [1] 
> https://docs.google.com/a/trevp.net/spreadsheet/ccc?key=0Aiexaz_YjIpddFJuWlNZaDBvVTRFSjVYZDdjakxoRkE&usp=sharing#gid=0

The Goldilocks code is almost ready to support E-521.  As a warmup non-Ed448 
curve, I took preliminary benchmarks for Ed480-Ridinghood.  From a single 
benchmark run (not SUPERCOP or similar):
        Goldilocks: 178kcy keygen, 536kcy ecdh
        Ridinghood: 193kcy keygen, 617kcy ecdh
Difference = +8%, +15%.
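
As a sanity check on the arithmetic in this thread, here is a minimal Python 
sketch.  The function names are hypothetical.  The Turbo Boost scaling 
assumes the reported count was taken against a counter ticking at the 3.4 GHz 
base clock while the core actually ran at 3.9 GHz, which is my reading of the 
correction quoted above, not something the paper confirms.

```python
def relative_overhead(baseline_kcy, other_kcy):
    """Percentage by which other_kcy exceeds baseline_kcy."""
    return 100.0 * (other_kcy - baseline_kcy) / baseline_kcy


def turbo_corrected_cycles(reported_cycles, base_ghz, turbo_ghz):
    """Scale a cycle count measured against a base-frequency counter
    to core cycles at the turbo clock (assumed measurement model)."""
    return reported_cycles * turbo_ghz / base_ghz


# Goldilocks vs. Ridinghood, figures from the benchmark run above (kcy).
keygen = relative_overhead(178, 193)
ecdh = relative_overhead(536, 617)
print(f"keygen: +{keygen:.1f}%, ecdh: +{ecdh:.1f}%")

# Scale factor for a Core i7 4770 running at 3.9 GHz against a 3.4 GHz counter.
factor = turbo_corrected_cycles(1.0, 3.4, 3.9)
print(f"turbo scale factor: {factor:.3f}")
```

Rounded to whole percentages, the first two results give the +8% and +15% 
quoted above.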

The +15% reflects some sections which aren’t optimized yet, along the lines of
        if (EDWARDS_D > 0) { do something slow; }
or
        if (Mike hasn’t calculated the carry handling limits yet) { reduce just to be safe; }

I also have a 521-bit multiplier which takes 145 Haswell cycles in preliminary 
benchmarks.  Like Granger-Scott, it uses 9 limbs of 58 bits each.  It’s still 
using 3-way Chung-Hasan, so it does more multiplies and fewer adds than the 
Granger-Scott technique.  Its speed advantage, if it actually has one, is 
probably from tighter tuning.  If the 145-cycle figure holds up, it may be 
about as fast as what Granger and Scott quoted, but measured properly, with 
Turbo Boost off.
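
To illustrate the 9-limb, 58-bit layout, here is a rough Python sketch of 
multiplication modulo 2^521 - 1.  It is plain schoolbook multiplication (81 
limb products), not the 3-way Chung-Hasan method used in my code and not the 
Granger-Scott technique.  All names are hypothetical, and the final carry and 
reduction lean on Python big integers rather than the careful carry handling 
a real implementation needs.  Since 9 * 58 = 522, a product column at limb 
index 9 or above wraps around with a factor of 2, because 2^522 = 2 * 2^521, 
which is congruent to 2 mod p.

```python
LIMBS = 9
RADIX_BITS = 58
MASK = (1 << RADIX_BITS) - 1
P = (1 << 521) - 1  # the E-521 field prime


def to_limbs(x):
    """Split an integer below 2^522 into 9 limbs of 58 bits, least significant first."""
    return [(x >> (RADIX_BITS * i)) & MASK for i in range(LIMBS)]


def from_limbs(limbs):
    """Recombine limbs (possibly wider than 58 bits) into one integer."""
    return sum(limb << (RADIX_BITS * i) for i, limb in enumerate(limbs))


def mul_mod_p(a, b):
    """Schoolbook 9x9 limb product with wraparound folding mod 2^521 - 1."""
    cols = [0] * LIMBS
    for i in range(LIMBS):
        for j in range(LIMBS):
            k = i + j
            term = a[i] * b[j]
            if k >= LIMBS:
                # 2^(58*9) = 2^522 is congruent to 2 mod p, so fold with a doubling.
                cols[k - LIMBS] += 2 * term
            else:
                cols[k] += term
    # Simplification: let Python big integers do the carries and final reduction.
    return to_limbs(from_limbs(cols) % P)


x = pow(3, 300, P)
y = pow(5, 250, P)
product = from_limbs(mul_mod_p(to_limbs(x), to_limbs(y)))
print(product == (x * y) % P)
```

By comparison, Goldilocks uses 8 limbs of 56 bits, which covers its 448-bit 
field exactly.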

— Mike
_______________________________________________
Curves mailing list
[email protected]
https://moderncrypto.org/mailman/listinfo/curves
