Re: fast inversion

2015-05-21 Thread Torbjörn Granlund
bodr...@mail.dm.unipi.it writes: But it is not an inline function, it's a macro redefining mpn_com, it will not conflict with the prototype __gmpn_com. (I hope ;-) Thanks, it seems to have helped. I suppose this bug means that we didn't really provide mpn_com in the public interface.

Re: fast inversion

2015-05-19 Thread Torbjörn Granlund
There are new build failures which seem related to this change. -- Torbjörn Please encrypt, key id 0xC8601622 ___ gmp-devel mailing list gmp-devel@gmplib.org https://gmplib.org/mailman/listinfo/gmp-devel

Re: fast inversion

2015-05-19 Thread Niels Möller
t...@gmplib.org (Torbjörn Granlund) writes: There are new build failures which seem related to this change. The declaration of mpn_com looks a bit fishy. It's conditionally declared in gmp-h.in, inside an #if __GMP_INLINE_PROTOTYPES || defined (__GMP_FORCE_mpn_com) But the inline definition

Re: fast inversion

2015-05-19 Thread bodrato
Ciao, On Tue, 19 May 2015 10:02 am, Niels Möller wrote: The declaration of mpn_com looks a bit fishy. It's conditionally declared in gmp-h.in, inside an #if __GMP_INLINE_PROTOTYPES || defined (__GMP_FORCE_mpn_com) But the inline definition is in gmp-impl.h, not gmp-h.in, so not

Re: fast inversion

2015-05-18 Thread bodrato
Ciao, I pushed Niels' code for mpn_neg. The old timings were: @shell ~/gmp-repo$ tune/speed -s 1-1030 -f 2 -c mpn_neg mpn_com mpn_add_1_inplace.1 overhead 6.78 cycles, precision 1 units of 2.86e-10 secs, CPU freq 3500.08 MHz mpn_neg mpn_com mpn_add_1_inplace.1 1

Re: fast inversion

2015-05-18 Thread bodrato
Ciao Paul, On Mon, 18 May 2015 11:33 am, paul zimmermann wrote: mpn_neg_n (tp, tp, n); should be mpn_neg instead? I have put this in Yes, of course. Anyway, in your code you should probably write: mpn_com (tp + l, tp + l, h); /* Amended the _n ;-) */ mpn_add_1 (tp +
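The complement-then-add-one idiom discussed above relies on the two's-complement identity -u = ~u + 1. The following is only a minimal sketch of that identity on a limb array. The names `limb_t`, `limbs_com`, `limbs_add_1` and `limbs_neg_via_com` are hypothetical stand-ins written for illustration; they are not GMP's functions and do not reproduce GMP's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* hypothetical stand-in for a limb type */

/* rp[i] = ~up[i] for 0 <= i < n; every limb is independent of the others. */
static void limbs_com(limb_t *rp, const limb_t *up, size_t n)
{
    for (size_t i = 0; i < n; i++)
        rp[i] = ~up[i];
}

/* Add the single limb b to {up, n}, store the result in {rp, n},
   and return the carry out of the top limb (0 or 1). */
static limb_t limbs_add_1(limb_t *rp, const limb_t *up, size_t n, limb_t b)
{
    for (size_t i = 0; i < n; i++) {
        limb_t s = up[i] + b;
        b = s < b;          /* unsigned wrap-around means a carry */
        rp[i] = s;
    }
    return b;
}

/* Negate {up, n} modulo 2^(64 n) as ~u + 1.
   Returns 1 if u was nonzero (a borrow), 0 if u was zero. */
static limb_t limbs_neg_via_com(limb_t *rp, const limb_t *up, size_t n)
{
    assert(n > 0);
    limbs_com(rp, up, n);
    /* ~u + 1 carries out of the top limb only when u == 0. */
    return limbs_add_1(rp, rp, n, 1) ^ 1;
}
```

The sketch makes two passes over the operand, which is the cost the thread is weighing against a single-pass negation.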

Re: fast inversion

2015-05-18 Thread Torbjörn Granlund
bodr...@mail.dm.unipi.it writes: The new code is faster for n==1, slower for 2 <= n <= 4, and faster (more than twice) for n >= 16. Nice speedup! In mpn/x86_64/fastsse/com.asm we have an mpn_com which will speed things up another 2x. It is not enabled on any platforms now as it needs

Re: fast inversion

2015-05-18 Thread Torbjörn Granlund
bodr...@mail.dm.unipi.it writes: @shell ~/gmp-repo$ tune/speed -s 1-1030 -f 2 -c mpn_neg mpn_com You might want to pass -p100 or somesuch to allow the CPU to speed up. (We might want to change the default, not sure to what.)

Re: fast inversion

2015-05-07 Thread bodrato
Ciao Paul, On Mon, 27 April 2015 4:37 pm, paul zimmermann wrote: Please tell me if you find a faster version of the code. Using tune/speed, the current code in GMP seems faster than the .c code you published. But this does not mean that the algorithm is faster. It depends, I suppose, on

mpn_neg (was: Re: fast inversion)

2015-04-29 Thread Niels Möller
t...@gmplib.org writes: Not sure why we're doing this as an inline function in gmp.h. Me neither. Perhaps as it is tiny? I think inlines in gmp.h make sense for functions which are tiny and O(1) in the common case, like mpn_add_1 and mpn_cmp. Or small O(1) wrappers around O(n)
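To illustrate what "tiny and O(1) in the common case" can look like, here is a minimal sketch of an in-place increment whose inline part touches only the low limb and falls back to an out-of-line loop when a carry must propagate. The names `limb_t`, `incr_inplace` and `incr_slow_path` are hypothetical and chosen for this example; this is not how gmp.h defines mpn_add_1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* hypothetical stand-in for a limb type */

/* Out-of-line O(n) part: propagate a carry of 1 into limbs 1..n-1.
   Returns the carry out of the top limb (0 or 1). */
static limb_t incr_slow_path(limb_t *p, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (++p[i] != 0)
            return 0;       /* carry absorbed by this limb */
    return 1;               /* carry ran off the top */
}

/* Inline O(1) fast path for {p, n} += 1: in the common case the low
   limb does not wrap around and nothing else needs to be touched. */
static inline limb_t incr_inplace(limb_t *p, size_t n)
{
    assert(n > 0);
    if (++p[0] != 0)
        return 0;
    return incr_slow_path(p, n);
}
```

Only the short test on the low limb is a candidate for inlining at every call site; the loop stays in one place, which is roughly the trade-off the message describes.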

Re: fast inversion

2015-04-29 Thread Niels Möller
bodr...@mail.dm.unipi.it writes: mpn_neg is defined in gmp.h, where we cannot check for HAVE_NATIVE, but I suspect that also the C version of mpn_com is faster than mpn_neg (in the former, limbs do not depend on one another, no carry propagation). I'd suggest something like this (totally

Re: fast inversion

2015-04-29 Thread Vincent Lefevre
On 2015-04-29 07:01:39 +0200, bodr...@mail.dm.unipi.it wrote: diff -r 6e11cd70e19e gmp-h.in --- a/gmp-h.in Mon Apr 27 22:46:53 2015 +0200 +++ b/gmp-h.in Tue Apr 28 22:41:52 2015 +0200 @@ -2191,13 +2191,10 @@ mp_limb_t mpn_neg (mp_ptr __gmp_rp, mp_srcptr __gmp_up, mp_size_t

Re: fast inversion

2015-04-29 Thread tg
ni...@lysator.liu.se (Niels Möller) writes: I'd suggest something like this (totally untested) mp_limb_t mpn_neg (mp_ptr rp, mp_srcptr up, mp_size_t n) { /* Low zero limbs are unchanged by negation. */ while (*up == 0) { *rp++ = 0; up++;
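The quoted suggestion is cut off in this listing, so the following is not Niels' code. It is a minimal sketch of the general idea the visible comment points at, under the assumption that the rest proceeds the usual way: low zero limbs stay zero, the first nonzero limb is negated, and all higher limbs are simply complemented, so no carry has to be propagated. The names `limb_t` and `limbs_neg_skip_zeros` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* hypothetical stand-in for a limb type */

/* Negate {up, n} modulo 2^(64 n) in a single pass.
   Returns 1 if u was nonzero (a borrow), 0 if u was zero. */
static limb_t limbs_neg_skip_zeros(limb_t *rp, const limb_t *up, size_t n)
{
    size_t i = 0;

    /* Low zero limbs are unchanged by negation. */
    while (i < n && up[i] == 0)
        rp[i++] = 0;

    if (i == n)
        return 0;                   /* the whole operand was zero */

    /* The first nonzero limb absorbs the "+ 1" of ~u + 1. */
    rp[i] = (limb_t)0 - up[i];
    i++;

    /* No carry reaches the higher limbs, so they are just complemented. */
    for (; i < n; i++)
        rp[i] = ~up[i];

    return 1;
}
```

Unlike the loop head visible in the quoted fragment, this sketch bounds the zero scan by n so that an all-zero operand is handled; whether the original does the same is not visible here.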

Re: fast inversion

2015-04-29 Thread tg
I just noticed that the fast mpn_com in x86_64/fastsse is not really used, making calls to mpn_com from the inlined mpn_neg perform poorly. I'll address this soon.

Re: fast inversion

2015-04-28 Thread bodrato
Ciao, On Mon, 27 April 2015 4:45 pm, t...@gmplib.org wrote: @shell ~/gmp-repo$ tune/speed -s 1-1030 -f 2 -c mpn_neg mpn_com mpn_add_1_inplace.1 overhead 6.78 cycles, precision 1 units of 2.86e-10 secs, CPU freq 3500.08 MHz mpn_neg mpn_com

Re: fast inversion

2015-04-27 Thread tg
bodr...@mail.dm.unipi.it writes: After a first glance at the code, two lines surprise me: mpn_com_n (tp, tp, n); mpn_add_1 (tp, tp, n, ONE); I wondered why you didn't use mpn_neg_n (tp, tp, n); Then I tested (on shell@gmplib) and... @shell ~/gmp-repo$ tune/speed