David Miller <da...@davemloft.net> writes:

  Yes, understood. We have to transpose a few of the shifts with their
  neighbouring arithmetic ops in this loop to make it optimal for
  Ultra-I/II/IIi.

I found a powered-up US2 and ran timing tests. No slowdown there for the new generic functions.
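For readers less familiar with the operation, here is a rough portable C sketch of what an lshiftc-style routine computes: shift a limb vector left by cnt bits, store the complement of the result, and return the bits shifted out of the top limb. The name `ref_lshiftc`, the fixed 64-bit limb type, and the argument order are assumptions made for illustration; this is not the GMP source or the assembly loop under discussion, and it says nothing about scheduling or c/l.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine (illustrative only, not GMP's code).
 * Shifts {up, n} left by cnt bits, writes the bitwise complement of the
 * shifted limbs to {rp, n}, and returns the (uncomplemented) bits shifted
 * out of the most significant limb.  Assumes 64-bit limbs, n > 0 and
 * 1 <= cnt <= 63. */
uint64_t ref_lshiftc(uint64_t *rp, const uint64_t *up, size_t n, unsigned cnt)
{
    assert(n > 0 && cnt >= 1 && cnt < 64);

    unsigned tnc = 64 - cnt;          /* complementary shift count */
    uint64_t high = up[n - 1];
    uint64_t out = high >> tnc;       /* bits falling off the top */

    /* Walk from the most significant limb down, so the routine also
     * works in place when rp == up. */
    for (size_t i = n - 1; i > 0; i--) {
        uint64_t low = up[i - 1];
        rp[i] = ~((high << cnt) | (low >> tnc));
        high = low;
    }

    /* The lowest limb has zeros shifted in; they become ones after
     * complementing. */
    rp[0] = ~(high << cnt);
    return out;
}
```

Each limb costs two shifts, an OR and a complement, which is the mix of shift and arithmetic ops whose ordering matters when tuning such a loop for a given pipeline.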
I suppose mpn/sparc64/ultrasparc1234/[lr]shift.asm are now redundant.

Clearly, the new lshiftc code is not optimal for US1 through US4. It runs 0.5 c/l slower on them all, compared to what one would hope for 2-way unrolled code.

-- 
Torbjörn
_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
http://gmplib.org/mailman/listinfo/gmp-devel