"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes: Interesting, but writing an assembly mpn_mul means writing manyversions...
The idea is not to "write mpn_mul in assembly" in the sense to write any of what is currently in generic/mul.c in assembly. I suppose as assembly mul.c would look like this: if (un == 1) // vn == 1 too umul_ppmm else if (((up xor vp) or (un xor vn)) == 0) tailcall mpn_sqr else if (ABOVE_THRESHOLD (vm, MUL_TOOM22_THRESHOLD)) tailcall mpn_mul_toom22_or_greater_written_in_C mpn_mul_basecase code here On systems where tailcalls are possible (like amd64 ELF systems) this will be very low-overhead also for sqr. On the generic (or no-asm) side, we could at least swap the first branches in mpn_mul. Currently we have: if (un == vn) { if (up == vp) mpn_sqr (prodp, up, un); else mpn_mul_n (prodp, up, vp, un); } else if (vn < MUL_TOOM22_THRESHOLD) { /* plain schoolbook multiplication */ If we start with the if (vn < MUL_TOOM22_THRESHOLD) branch, we can (almost) directly consider _mul_basecase for small operands. Of course this is a slowdown for code that (should we keep on supporting this?) calls mpn_mul(r,a,n,a,n) instead of mpn_sqr(r,a,n). It is surely silly that we don't have any mpn call for when it is known that the multiplier and multiplicand are distinct. -- Torbjörn Please encrypt, key id 0xC8601622 _______________________________________________ gmp-devel mailing list gmp-devel@gmplib.org https://gmplib.org/mailman/listinfo/gmp-devel