We now have a nice set of x86_64 gcd_22 implementations. The code is not
as well tuned as the gcd_11 code, but it runs reasonably fast.
I haven't explored the table-based variant, which gives 3 bits of
progress per iteration. It might make the new code obsolete for
machines with fast multiply.
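
For reference, here is a minimal sketch of the kind of binary GCD step
that a gcd_11 routine performs on two single-limb operands. It is not the
GMP code: the name gcd_11_sketch is hypothetical, it assumes both inputs
are odd, it uses a plain 64-bit type instead of mp_limb_t, and it relies
on the GCC/Clang builtin __builtin_ctzll. It is not the table-based
variant either.

```c
#include <stdint.h>

/* Hypothetical illustration only, not GMP's gcd_11.
   Assumption: both u and v are odd (and therefore nonzero). */
uint64_t gcd_11_sketch(uint64_t u, uint64_t v)
{
    while (u != v) {
        /* Keep the larger value in u so the subtraction cannot wrap. */
        if (u < v) {
            uint64_t t = u;
            u = v;
            v = t;
        }
        /* The difference of two distinct odd numbers is even and nonzero. */
        u -= v;
        /* Strip all trailing zero bits so that u is odd again. */
        u >>= __builtin_ctzll(u);
    }
    return u;
}
```

For example, gcd_11_sketch(21, 35) returns 7. Each iteration removes at
least one bit from the difference, and a gcd_22 routine would apply the
same idea to two-limb operands.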
Now what? Should
Ciao,
On Sat, 24 August 2019 12:14 am, Torbjörn Granlund wrote:
> "Marco Bodrato" writes:
> It is not elegant, I agree, but maybe joining them both in a single
> .asm file, so that the jump is local?
>
> We might do that, but it makes things a lot more complicated as there
The