From: David Miller <da...@davemloft.net>
Date: Tue, 02 Apr 2013 15:06:42 -0400 (EDT)
> With this corrected I tried dive_1.asm and several tests fail, I'll
> try to figure out why.

This turned out to be easy: you were using %o5 as the register for 'dinv', but it gets clobbered elsewhere in the code. Using %o4 instead fixes the problems.

Attached is a dive_1.asm that works for me on real hardware, along with T4 timings from:

    tune/speed -p10000000 -s1-1000 -f1.1 -C mpn_divexact_1.3
Attachment: dive_1.asm (binary data)
overhead 6.00 cycles, precision 10000000 units of 3.51e-10 secs, CPU freq 2847.41 MHz
        mpn_divexact_1.3
1            43.0004
2            26.9174
3            22.7782
4            20.5837
5            19.4004
6            20.1670
7            20.0004
8            20.2504
9            19.8338
10           19.7004
11           19.8186
12           20.0008
13           20.3851
14           20.7862
15           21.1338
16           21.4380
17           21.7067
18           21.9450
19           22.1584
20           22.3505
22           22.6828
24           22.9593
26           23.1933
28           23.3939
30           30.7669
33           29.5761
36           29.2781
39           29.0260
42           28.8098
46           28.5655
50           28.3603
55           28.1458
60           27.9670
66           27.7882
72           27.6392
79           27.4940
86           27.3724
94           27.2556
103          27.1459
113          27.0446
124          26.9519
136          26.8680
149          26.7922
163          26.7242
179          26.6595
196          26.6023
215          26.5491
236          26.5003
259          26.4559
284          26.4158
312          26.3785
343          26.3443
377          26.3133
414          26.2853
455          26.2596
500          26.2363
550          26.2148
605          26.1953
665          26.1778
731          26.1617
804          26.1471
884          26.1338
972          26.1218
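
For context on why a clobbered 'dinv' makes the tests fail: in an exact division by a single limb, the inverse of the divisor is typically computed once and then multiplied into every limb of the loop, so it has to stay live for the whole routine. The following is only a rough C sketch of that idea, assuming 64-bit limbs, an odd divisor, and the GCC/Clang `unsigned __int128` extension. The names `binv64` and `divexact_1_odd` are made up for illustration; this is not the code in the attached dive_1.asm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inverse of an odd d modulo 2^64, by Newton iteration.
   (Hypothetical helper name, for illustration only.) */
static uint64_t binv64(uint64_t d)
{
    uint64_t inv = d;             /* d*d == 1 mod 8, so 3 bits are correct */
    for (int i = 0; i < 5; i++)   /* correct bits: 6, 12, 24, 48, 96 */
        inv *= 2 - d * inv;
    return inv;
}

/* q[0..n-1] = u[0..n-1] / d, assuming d is odd and divides u exactly.
   (Hypothetical function name; a sketch, not GMP's implementation.) */
static void divexact_1_odd(uint64_t *q, const uint64_t *u, size_t n, uint64_t d)
{
    assert(d & 1);                /* this sketch handles odd divisors only */

    uint64_t dinv = binv64(d);    /* must survive every loop iteration */
    uint64_t c = 0;               /* carry out of the previous limb */

    for (size_t i = 0; i < n; i++) {
        uint64_t s = u[i];
        uint64_t l = s - c;       /* subtract the carry from the limb below */
        uint64_t b = (s < c);     /* borrow produced by that subtraction */
        uint64_t qi = l * dinv;   /* quotient limb, modulo 2^64 */
        q[i] = qi;
        /* The high half of qi*d is what must be removed from the next limb. */
        uint64_t hi = (uint64_t)(((unsigned __int128)qi * d) >> 64);
        c = hi + b;
    }
}
```

If the register holding `dinv` is overwritten partway through, every later `l * dinv` yields a wrong quotient limb, which would be consistent with several tests failing rather than all of them.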
_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
http://gmplib.org/mailman/listinfo/gmp-devel