David Miller <da...@davemloft.net> writes:

> While waiting for the FSF to execute my assignment, I tweaked my
> existing 2-way unrolled mul_1 and addmul_1 loops.
>
> Currently on T4 I'm at:
>
>   mul_1       3.8 cycles/limb
>   addmul_1    5.5 cycles/limb

Nice progress!

I still recommend 4-way unrolling for at least the most critical
functions. :-)
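To make "4-way unrolling" concrete, here is a minimal C sketch of the idea for a mul_1-style loop. It is only an illustration under stated assumptions: the function name mul_1_unrolled4 is hypothetical, limbs are assumed to be 32 bits wide so that a 64-bit product fits in uint64_t, and the loops discussed in this thread are hand-written SPARC assembly, not C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not GMP's mpn_mul_1): multiply the n-limb number
   {up, n} by the single limb v, store the low n limbs of the product in
   rp, and return the carry-out limb. Assumes 32-bit limbs. */
uint32_t mul_1_unrolled4(uint32_t *rp, const uint32_t *up, size_t n, uint32_t v)
{
    uint64_t carry = 0;
    size_t i = 0;

    /* Main loop: four limbs per iteration, so the loop-control overhead
       is paid once per four multiplies, and the four products are
       independent of each other. */
    for (; i + 4 <= n; i += 4) {
        uint64_t p0 = (uint64_t)up[i]     * v;
        uint64_t p1 = (uint64_t)up[i + 1] * v;
        uint64_t p2 = (uint64_t)up[i + 2] * v;
        uint64_t p3 = (uint64_t)up[i + 3] * v;

        /* Carry propagation is the only serial dependency. Each sum
           fits in 64 bits: (2^32-1)^2 + (2^32-1) < 2^64. */
        p0 += carry;     rp[i]     = (uint32_t)p0;
        p1 += p0 >> 32;  rp[i + 1] = (uint32_t)p1;
        p2 += p1 >> 32;  rp[i + 2] = (uint32_t)p2;
        p3 += p2 >> 32;  rp[i + 3] = (uint32_t)p3;
        carry = p3 >> 32;
    }

    /* Tail loop: handle the remaining 0 to 3 limbs one at a time. */
    for (; i < n; i++) {
        uint64_t p = (uint64_t)up[i] * v + carry;
        rp[i] = (uint32_t)p;
        carry = p >> 32;
    }

    return (uint32_t)carry;
}
```

Whether a compiler turns this into anything close to the cycle counts quoted above is not something the sketch claims; it only shows the loop structure that unrolling by four implies, including the tail handling needed when n is not a multiple of four.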
-- 
Torbjörn
_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
http://gmplib.org/mailman/listinfo/gmp-devel