ni...@lysator.liu.se (Niels Möller) writes: Yes. By one if divappr_2 can produce a useful single-limb remainder, otherwise by two.
I am very curious about the result of this work. A larger submul_1 tripcount is not for free, but the current 3/2 quotient approximation is also not for free. I suspect that significant speedup would come from early, speculative quotient approximation. We do its equivalent for some asm redc_1 and its sibling mpn_sbpi1_bdiv_r (see e.g. mpn/x86_64/zen/sbpi1_bdiv_r.asm). This was a significant speedup. Of course, for bdiv the quotient is not approximated and therefore the next one is not speculative. That makes bdiv easier to write and to get right, but I think the speedup potential is similar. -- Torbjörn Please encrypt, key id 0xC8601622 _______________________________________________ gmp-devel mailing list gmp-devel@gmplib.org https://gmplib.org/mailman/listinfo/gmp-devel