Re: Sandybridge addmul_N challenge

2012-02-22 Thread Torbjorn Granlund
I doubt we can make addmul_1 run faster on sandybridge. But I'd like mul_basecase to run much faster than 3 c/l. Then sqr_basecase and redc_1, redc_2 should be fixed. An addmul_2 running at 3 c/l or better would be great. That means we need to handle a tick in it using = 17 insns,

_mp_alloc vs ALLOC

2012-02-22 Thread Marc Glisse
Hello, is there any objection if I replace most uses of ->_mp_alloc by calls to the ALLOC macro in mp[zqf] (and similarly for _mp_size, etc)? It helps when experimenting... I am also considering moving the NUM and DEN macros from test/mpq/t-cmp* to gmp-impl.h, since I assume mpq_numref and
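To illustrate the kind of change proposed here, a minimal C sketch of field access through accessor macros follows. The struct `demo_int` and the helper `spare_limbs` are hypothetical stand-ins made up for this example, and `SIZ` is an assumed macro name; only `ALLOC`, `PTR`, `_mp_alloc` and `_mp_size` are named in the thread, and their exact definitions in gmp-impl.h are not shown there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an integer struct; the field names follow
   the _mp_alloc / _mp_size naming used in the message. */
typedef struct {
  int _mp_alloc;          /* number of limbs allocated */
  int _mp_size;           /* limbs in use; negative for a negative value */
  unsigned long *_mp_d;   /* pointer to the limb array */
} demo_int;

/* Accessor macros in the style of ALLOC. SIZ is an assumed name. */
#define ALLOC(x) ((x)->_mp_alloc)
#define SIZ(x)   ((x)->_mp_size)
#define PTR(x)   ((x)->_mp_d)

/* Number of limbs that still fit before a reallocation is needed. */
static int spare_limbs(const demo_int *x)
{
  assert(x != NULL);
  int used = SIZ(x) < 0 ? -SIZ(x) : SIZ(x);
  return ALLOC(x) - used;
}
```

Because callers only see the macros, the field layout can be changed while experimenting by editing the macro definitions alone.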

Re: _mp_alloc vs ALLOC

2012-02-22 Thread Torbjorn Granlund
Marc Glisse marc.gli...@inria.fr writes: is there any objection if I replace most uses of ->_mp_alloc by calls to the ALLOC macro in mp[zqf] (and similarly for _mp_size, etc)? It helps when experimenting... I am also considering moving the NUM and DEN macros from test/mpq/t-cmp* to

Re: _mp_alloc vs ALLOC

2012-02-22 Thread bodrato
Ciao, On Wed, 22 February 2012 7:41 pm, Torbjorn Granlund wrote: Marc Glisse marc.gli...@inria.fr writes: their length. By the way, is there any difference between PTR and LIMBS? Say one that should be used in some circumstances and one in others? You're welcome to clean up

Re: _mp_alloc vs ALLOC

2012-02-22 Thread Torbjorn Granlund
bodr...@mail.dm.unipi.it writes: Unrelated :-) We might define more macros like TMP_ALLOC_LIMBS_2 . I mean _3 and _4. So that they can be used to reduce the number of allocations. Do you agree? (I just touched mpz/gcdext.c, and _4 should be used there). I'd vote for killing

Re: _mp_alloc vs ALLOC

2012-02-22 Thread Niels Möller
Torbjorn Granlund t...@gmplib.org writes: TMP_ALLOC_LIMBS_2 is clutter IMHO. Sure, it's pointless in a normal build. As I understand it, the reason for having TMP_ALLOC_LIMBS_2 is to make --enable-alloca=debug more effective, by getting some kind of red zone separating the two areas. Whether

Re: _mp_alloc vs ALLOC

2012-02-22 Thread Marc Glisse
On Wed, 22 Feb 2012, Torbjorn Granlund wrote: bodr...@mail.dm.unipi.it writes: Unrelated :-) We might define more macros like TMP_ALLOC_LIMBS_2 . I mean _3 and _4. So that they can be used to reduce the number of allocations. Do you agree? (I just touched mpz/gcdext.c, and _4 should be used

Re: _mp_alloc vs ALLOC

2012-02-22 Thread Torbjorn Granlund
Marc Glisse marc.gli...@inria.fr writes: That's for the alloca case. Without alloca, one call to malloc is better than two (although that usually also means the numbers are big and any gmp operation will dwarf allocation). Also, the threshold between alloca and malloc is quite high, and
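The trade-off discussed in this thread (one allocation split into two areas in a normal build, two separate allocations in a debug build so the areas are kept apart) can be sketched in plain C as below. This is only an illustration using `malloc`; the names `limb_t`, `alloc_limbs_2` and `free_limbs_2` are hypothetical, and it is not the actual TMP_ALLOC_LIMBS_2 definition, which is not shown in the thread.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned long limb_t;   /* hypothetical limb type */

/* Hypothetical helper: obtain two limb areas of xn and yn limbs.
   debug != 0: two separate allocations, so the areas are not adjacent
               parts of one block and a checker can treat them apart.
   debug == 0: one allocation, with *yp pointing just past the first area.
   Returns 0 on success, -1 on allocation failure.
   Size overflow in xn + yn is not checked in this sketch. */
static int alloc_limbs_2(limb_t **xp, size_t xn,
                         limb_t **yp, size_t yn, int debug)
{
  assert(xn > 0 && yn > 0);
  if (debug) {
    *xp = malloc(xn * sizeof(limb_t));
    *yp = malloc(yn * sizeof(limb_t));
    if (*xp == NULL || *yp == NULL) {
      free(*xp);   /* free(NULL) is a no-op */
      free(*yp);
      return -1;
    }
  } else {
    *xp = malloc((xn + yn) * sizeof(limb_t));
    if (*xp == NULL)
      return -1;
    *yp = *xp + xn;
  }
  return 0;
}

/* Release areas obtained from alloc_limbs_2 with the same debug flag. */
static void free_limbs_2(limb_t *xp, limb_t *yp, int debug)
{
  if (debug)
    free(yp);   /* yp is a separate block only in debug mode */
  free(xp);
}
```

In the non-debug path the caller pays for a single allocation, which is the saving argued for above; the debug path gives that up in exchange for separated areas.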