ni...@lysator.liu.se (Niels Möller) writes: I also had a quick look at the math, and I realized (some of you surely knew that already) that floor(sqrt(a)) is mostly independent of the lowest half of the bits of a.

Some of us indeed knew that... And this generalises to kth roots. It is similar to the more familiar problem of an m-limb by n-limb division; the low n dividend limbs affect the quotient very little... The lowest half of the bits are needed only for computing the remainder, and a final adjustment-by-one step. I think the iteration should totally ignore the low half bits in each step. Would it make sense with an mpn_sqrt_appr which takes a n-limb normalized number A as input, and computes an n-limb square root approximation of B^{n-1} A, with some well defined error bound, of at most 2? (We might also need a function for the the (n-1)-limb sqrt of B^{n-2} A). And a corresponding mpn_root_appr would compute the kth root of B^{(k-1) n - 1} A, I think. I haven't looked enough at your propposal to have an informed opinion. I think mpn_rootrem's approach for when the remainder is not needed might be wise, perhaps wiser. It computes with one extra limb, then uses that to almost always return the correct result without any (partial) remainder computation. To speed sqrt (non-rem) we might want to do like root. But both functions should probably do that only for operands over say 10 limbs. Below that, some _appr approach will be better. And as written before, as k grows, we would benefit enormously from a pow_1_highpart. The average speedup potential is O(k). -- Torbjörn Please encrypt, key id 0xC8601622 _______________________________________________ gmp-devel mailing list gmp-devel@gmplib.org https://gmplib.org/mailman/listinfo/gmp-devel