ni...@lysator.liu.se (Niels Möller) writes:
Here's a sketch of an addmul_2 iteration using Karatsuba. I assume we
have vl, vh, vd = |vl - vh| and an appropriate sign vmask in registers
before the loop. Carry input in c0, c1, carry out in r2, r3.
mov (up), %rax
mov
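
For readers following along, here is a minimal C model of what one such
iteration could compute. It is not the assembly from the message above
(which is cut off), only a sketch under these assumptions: each iteration
consumes two limbs ul, uh of up, multiplies them by (vl, vh) with three
multiplications, and adds the low two product limbs, the old rp[0], rp[1]
and the carry-in c0, c1 together. The function names, the add_at helper,
the umask/ud pair and the use of unsigned __int128 (a GCC/Clang extension)
are all illustrative and do not come from the thread. A branch stands in
for the mask arithmetic a real loop would use.

```c
#include <stdint.h>

typedef unsigned __int128 u128;   /* GCC/Clang extension, assumed available */

/* acc += x * B^pos, with B = 2^64 and acc a 4-limb little-endian number.
   Only called with pos <= 2, so the high half of x always has a limb. */
static void add_at(uint64_t acc[4], int pos, u128 x)
{
    u128 t = (u128)acc[pos] + (uint64_t)x;
    acc[pos] = (uint64_t)t;
    t = (t >> 64) + (x >> 64);
    for (int i = pos + 1; i < 4; i++) {
        t += acc[i];
        acc[i] = (uint64_t)t;
        t >>= 64;
    }
}

/* Hypothetical model of one iteration: {rp, c} = rp + c + (ul + uh*B)*(vl + vh*B).
   On return rp[0..1] hold the low limbs and c[0..1] the carry out. */
static void addmul_2x2_karatsuba(uint64_t rp[2], uint64_t c[2],
                                 uint64_t ul, uint64_t uh,
                                 uint64_t vl, uint64_t vh)
{
    /* vd = |vl - vh| and vmask would be computed once, before the loop. */
    uint64_t vmask = -(uint64_t)(vl < vh);
    uint64_t vd = ((vl - vh) ^ vmask) - vmask;
    uint64_t umask = -(uint64_t)(ul < uh);
    uint64_t ud = ((ul - uh) ^ umask) - umask;

    u128 ll = (u128)ul * vl, hh = (u128)uh * vh, dd = (u128)ud * vd;

    /* Middle coefficient ul*vh + uh*vl = ll + hh - (ul - uh)*(vl - vh),
       kept as 128 bits in mid plus one extra bit in mid_hi. */
    u128 mid = ll + hh;
    uint64_t mid_hi = mid < ll;
    if (umask ^ vmask) {          /* differences have opposite signs: add */
        mid += dd;
        mid_hi += mid < dd;
    } else {                      /* same sign: subtract, with borrow */
        mid_hi -= mid < dd;
        mid -= dd;
    }

    uint64_t acc[4] = { (uint64_t)ll, (uint64_t)(ll >> 64),
                        (uint64_t)hh, (uint64_t)(hh >> 64) };
    add_at(acc, 1, mid);
    acc[3] += mid_hi;
    add_at(acc, 0, ((u128)rp[1] << 64) | rp[0]);
    add_at(acc, 0, ((u128)c[1] << 64) | c[0]);

    rp[0] = acc[0]; rp[1] = acc[1];
    c[0] = acc[2];  c[1] = acc[3];
}

/* Schoolbook reference with four multiplications, for comparison. */
static void addmul_2x2_schoolbook(uint64_t rp[2], uint64_t c[2],
                                  uint64_t ul, uint64_t uh,
                                  uint64_t vl, uint64_t vh)
{
    uint64_t acc[4] = { rp[0], rp[1], 0, 0 };
    add_at(acc, 0, ((u128)c[1] << 64) | c[0]);
    add_at(acc, 0, (u128)ul * vl);
    add_at(acc, 1, (u128)ul * vh);
    add_at(acc, 1, (u128)uh * vl);
    add_at(acc, 2, (u128)uh * vh);
    rp[0] = acc[0]; rp[1] = acc[1];
    c[0] = acc[2];  c[1] = acc[3];
}

/* Returns 1 if both variants agree on the given inputs. */
static int karatsuba_matches_schoolbook(uint64_t ul, uint64_t uh,
                                        uint64_t vl, uint64_t vh,
                                        uint64_t r0, uint64_t r1,
                                        uint64_t c0, uint64_t c1)
{
    uint64_t ra[2] = { r0, r1 }, ca[2] = { c0, c1 };
    uint64_t rb[2] = { r0, r1 }, cb[2] = { c0, c1 };
    addmul_2x2_karatsuba(ra, ca, ul, uh, vl, vh);
    addmul_2x2_schoolbook(rb, cb, ul, uh, vl, vh);
    return ra[0] == rb[0] && ra[1] == rb[1] && ca[0] == cb[0] && ca[1] == cb[1];
}
```

The total never overflows four limbs: the product is at most (B^2 - 1)^2,
and adding two values below B^2 gives at most B^4 - 1.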
ni...@lysator.liu.se (Niels Möller) writes:
One can decrease it a bit by adding c0, c1 earlier (do you think
recurrency can be a problem if we add c0, c1 to the first product?) and
doing an in-place add to (rp) and 8(rp) at the end.
I could get it down to 30 instructions with a deep
Torbjorn Granlund <t...@gmplib.org> writes:
In loopmixer or manually? I wouldn't draw any conclusions without
mixing the code first...
With the loop mixer.
Meaning evaluating in +1 instead of -1, I assume.
Exactly.
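
To make the +1 variant concrete: the middle Karatsuba coefficient can be
taken from (ul + uh)*(vl + vh) - ul*vl - uh*vh, so no absolute values or
sign mask are needed, but each sum can be 65 bits wide. The C sketch below
only models that arithmetic. The names middle_plus1, middle_direct and
plus1_matches_direct are illustrative, not from the thread, and unsigned
__int128 is assumed (GCC/Clang extension).

```c
#include <stdint.h>

typedef unsigned __int128 u128;   /* GCC/Clang extension, assumed available */

/* Middle coefficient ul*vh + uh*vl via evaluation at +1.
   The result is below 2*B^2 (B = 2^64): low 128 bits are returned,
   the remaining bit is stored in *hi. */
static u128 middle_plus1(uint64_t ul, uint64_t uh,
                         uint64_t vl, uint64_t vh, uint64_t *hi)
{
    uint64_t su = ul + uh, cu = su < ul;   /* ul + uh = cu*B + su */
    uint64_t sv = vl + vh, cv = sv < vl;   /* vl + vh = cv*B + sv */

    /* (cu*B + su)*(cv*B + sv) = su*sv + (cu*sv + cv*su)*B + cu*cv*B^2 */
    u128 p = (u128)su * sv;
    uint64_t top = cu & cv;                /* coefficient of B^2 */
    u128 x = (u128)(cu ? sv : 0) + (cv ? su : 0);
    top += (uint64_t)(x >> 64);            /* bit 64 of x lands at B^2 */
    u128 shifted = x << 64;
    p += shifted;
    top += p < shifted;

    /* Subtract the two outer products, borrowing from top. */
    u128 ll = (u128)ul * vl, hh = (u128)uh * vh;
    top -= p < ll;  p -= ll;
    top -= p < hh;  p -= hh;

    *hi = top;
    return p;
}

/* Direct two-multiplication reference. */
static u128 middle_direct(uint64_t ul, uint64_t uh,
                          uint64_t vl, uint64_t vh, uint64_t *hi)
{
    u128 a = (u128)ul * vh, s = a + (u128)uh * vl;
    *hi = s < a;
    return s;
}

/* Returns 1 if both computations agree on the given inputs. */
static int plus1_matches_direct(uint64_t ul, uint64_t uh,
                                uint64_t vl, uint64_t vh)
{
    uint64_t h1, h2;
    u128 m1 = middle_plus1(ul, uh, vl, vh, &h1);
    u128 m2 = middle_direct(ul, uh, vl, vh, &h2);
    return m1 == m2 && h1 == h2;
}
```

The carry bits cu and cv are the cost of this variant: they replace the
absolute-value and sign handling of the -1 evaluation with extra
conditional additions.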
Did you compute the recurrency chain? Annotating the instructions on
the