Paul Zimmermann <paul.zimmerm...@inria.fr> writes:

> 1) the use of mpn_add_n_sub_n is not activated by default in mul_fft.c.
>    It might give a small speedup in some cases.

I think add_n_sub_n was originally motivated by improved locality (which
could apply at different levels of the memory hierarchy). But maybe we
could get close to twice the speed using newer instructions with multiple
carry flags (I guess that's what powerpc64/mode64/p9/add_n_sub_n.asm is
doing)?

We could probably do something similar on x86_64 with adox and adcx (if
corresponding subtract instructions are missing, on-the-fly negation
should be fairly efficient).

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
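
For reference, the locality argument can be illustrated with a small portable C sketch. This is not GMP's implementation: the name add_n_sub_n_sketch, the use of uint64_t limbs, and the packing of the return value as 2*carry + borrow are all assumptions made for the example. Each limb of a and b is loaded once and used for both the sum and the difference, and the carry and borrow chains are independent of each other, which is the property that instructions with separate carry flags might exploit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not GMP's code: computes sum = a + b and
   diff = a - b over n limbs in a single pass.  Returns 2*carry + borrow
   (a packing chosen for this sketch). */
static unsigned
add_n_sub_n_sketch(uint64_t *sum, uint64_t *diff,
                   const uint64_t *a, const uint64_t *b, size_t n)
{
    unsigned cy = 0;  /* carry chain for the addition */
    unsigned bw = 0;  /* borrow chain for the subtraction */

    for (size_t i = 0; i < n; i++) {
        /* Load each operand limb once; both results reuse it. */
        uint64_t x = a[i], y = b[i];

        /* Addition with incoming carry. */
        uint64_t s = x + y;
        unsigned c1 = s < x;
        uint64_t s2 = s + cy;
        unsigned c2 = s2 < s;

        /* Subtraction with incoming borrow. */
        uint64_t d = x - y;
        unsigned b1 = x < y;
        uint64_t d2 = d - bw;
        unsigned b2 = d < bw;

        sum[i] = s2;
        diff[i] = d2;
        cy = c1 | c2;  /* at most one of c1, c2 can be set */
        bw = b1 | b2;  /* at most one of b1, b2 can be set */
    }
    return 2 * cy + bw;
}
```

Because both operand limbs are read before either result is stored, the sketch also works when sum or diff aliases a or b limb for limb.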