Re: Anomaly in mpn_sqrtrem and mpn_rottrem
ni...@lysator.liu.se (Niels Möller) writes: t...@gmplib.org (Torbjörn Granlund) writes: Surely, we could make mpn_rootrem run faster, in particular for small arguments. But also for large arguments, 2x slowdown seems like a lot. I've had a quick look. Both mpn_dc_sqrtrem and mpn_rootrem_internal (which seem to be the work horses for larger operands) do a division in the loop. The latter is organized as a newton iteration, the former isn't (but I guess the computation is more or less equivalent to a Newton iteration). The code is a bit too complex for me to really compare them... Ahum, same sensation here... For larger operands, I imagine a rewrite to use a division-free Newton-iteration for an approximation of the inverse root would be faster. I played some years ago with a loop computing A^(-0.5), which naturally becomes division-free. An alternative would be to keep the current algorithm, but use the fact that the previous divisor is a damn good staring approximation to the new one, and use Newton to maintain an inverse to it. For mpn_rootrem, i.e., a^(1/k), I suppose we could make the work take O(M(log(a^(1/k as opposed to the current O(M(log(a))), where M(n) is the time for an n-bit multiply. We'd need to make use of mpn_pow_1_highpart. This will be a lot of work. -- Torbjörn Please encrypt, key id 0xC8601622 ___ gmp-devel mailing list gmp-devel@gmplib.org https://gmplib.org/mailman/listinfo/gmp-devel
Re: Anomaly in mpn_sqrtrem and mpn_rottrem
t...@gmplib.org (Torbjörn Granlund) writes: Surely, we could make mpn_rootrem run faster, in particular for small arguments. But also for large arguments, 2x slowdown seems like a lot. I've had a quick look. Both mpn_dc_sqrtrem and mpn_rootrem_internal (which seem to be the work horses for larger operands) do a division in the loop. The latter is organized as a newton iteration, the former isn't (but I guess the computation is more or less equivalent to a Newton iteration). The code is a bit too complex for me to really compare them... For larger operands, I imagine a rewrite to use a division-free Newton-iteration for an approximation of the inverse root would be faster. Regards, /Niels -- Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26. Internet email is subject to wholesale government surveillance. ___ gmp-devel mailing list gmp-devel@gmplib.org https://gmplib.org/mailman/listinfo/gmp-devel
Re: Anomaly in mpn_sqrtrem and mpn_rottrem
t...@gmplib.org (Torbjörn Granlund) writes: For mpn_rootrem, i.e., a^(1/k), I suppose we could make the work take O(M(log(a^(1/k as opposed to the current O(M(log(a))), where M(n) is the time for an n-bit multiply. We'd need to make use of mpn_pow_1_highpart. This will be a lot of work. Note that O(M(log(a^(1/k = O(M(log(a)/k)), i.e., that we can speed mpn_rootrem up by a factor k (at least for large k, and large a). Perhaps this does not need to be super-hard:. An outline: (1) make the computation with log2(a)/k bits + eps extra bits (an extra limb would do at least if the limb are not too small), happily truncating intermediates. (2) at the end do two final exponentiations, one truncating towards 0 and one rounding towards +oo. These, keeping just log2(a)/k + eps high bits, should be at most 1 apart sort of. -- Torbjörn Please encrypt, key id 0xC8601622 ___ gmp-devel mailing list gmp-devel@gmplib.org https://gmplib.org/mailman/listinfo/gmp-devel