On 1/31/14, Mike Hamburg <[email protected]> wrote:
> You can access ADC by using __uint128_t on clang and gcc (and possibly on
> other platforms).  It's still faster in assembly or intrinsics, mostly due
> to the register allocator barfing on EC numerics code, but it still pretty
> much works in C.

But not in ADC's full generality.  As far as I know, there is no way to
use ADC to propagate a carry all the way through a larger-than-128-bit
number from C, even with the 128-bit type.  (ADC is also available in
that limited sense in 32-bit mode using the 64-bit types.)

> ADC is also passably fast on most processors, though it works better on AMD
> than Intel I've heard.  At 256 bits, it's not necessarily worth having extra
> limbs to reduce the number of ADC instructions.

The Ed25519 paper reports that ADC can be used only once every two
cycles on then-recent Intel processors, compared to up to three ADDs in
every cycle; and that because of that limitation on ADC, five 51-bit
limbs are indeed faster than four 64-bit limbs on those processors.
(But Samuel Neves reports that Intel processors are improving.)

> At 384 bits, it may be
> worth going to extra limbs, and also Karatsuba may be profitable.  By 448
> bits, you almost definitely want both reduced-radix and Karatsuba.  I think
> 2^521-1 probably wants 9x58-bit limbs and 3-way Karatsuba, but I haven't
> tuned my implementation yet.
>
> Diego, have you implemented arithmetic mod the primes in your paper?  Do you
> know whether they're fast or not, and with what implementations, and maybe
> even on what platforms, or are you speculating?

I don't know about him, but I'm speculating (though with a few
calculations to support the speculation).  My main interest (for curves
for long-term security) is in simple C implementations, though I do want
to choose curves which I know will be efficiently implementable on NEON
vector units.


Robert Ransom
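
For reference, here is a minimal sketch of the two limb layouts compared above for a 256-bit addition. It is not code from either poster. It assumes a compiler that provides unsigned __int128 (gcc or clang), and the names add_4x64, add_5x51 and carry_5x51 are made up for illustration. The full-radix version writes the carry chain portably through the 128-bit type; whether the compiler turns that loop into a chain of ADC instructions is up to the compiler. The reduced-radix version needs only plain ADDs and defers the carries to a separate pass.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the compiler provides unsigned __int128 (gcc, clang). */
typedef unsigned __int128 u128;

/* Full radix: r = a + b over four 64-bit limbs, least significant first.
   Returns the carry out of the top limb (0 or 1). */
static uint64_t add_4x64(uint64_t r[4], const uint64_t a[4], const uint64_t b[4])
{
    u128 acc = 0;
    for (int i = 0; i < 4; i++) {
        acc += (u128)a[i] + b[i];  /* two limbs plus the incoming carry */
        r[i] = (uint64_t)acc;      /* low 64 bits are the result limb */
        acc >>= 64;                /* high bits are the outgoing carry */
    }
    return (uint64_t)acc;
}

/* Reduced radix: five limbs holding nominally 51 bits each, least
   significant first. Addition is limb-wise with no carry propagation.
   Inputs must stay below 2^63 so the 64-bit sums cannot wrap. */
static void add_5x51(uint64_t r[5], const uint64_t a[5], const uint64_t b[5])
{
    for (int i = 0; i < 5; i++) {
        assert((a[i] >> 63) == 0 && (b[i] >> 63) == 0);
        r[i] = a[i] + b[i];
    }
}

/* Carry pass: bring every limb back under 2^51, pushing the excess into
   the next limb. Returns whatever spills out of the top limb; reducing
   that modulo a particular prime is left to the caller. */
static uint64_t carry_5x51(uint64_t r[5])
{
    const uint64_t mask = ((uint64_t)1 << 51) - 1;
    uint64_t c = 0;
    for (int i = 0; i < 5; i++) {
        r[i] += c;
        c = r[i] >> 51;
        r[i] &= mask;
    }
    return c;
}
```

With 51-bit limbs in 64-bit words there are 13 spare bits per limb, so many additions can be accumulated before carry_5x51 has to run; the 4x64 layout has no such slack and must propagate the carry on every addition.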
