Re: question on fixing the mpz_sizeinbase()

2024-03-17 Thread marco . bodrato
Ciao, I answer here to a question from gmp-discuss. 17 mar 2024, 01:28 da t...@gmplib.org: > "website.reader" writes: > > Here's the suggested fix for this command in a C++ code unit > This question arises once every couple of years, more or less... Should we write a new

Re: Has there been historical work done to investigate small integer optimization?

2024-02-12 Thread marco . bodrato
Ciao, 12 feb 2024, 11:54 da paul.zimmerm...@inria.fr: > as Niels said, I fear the cost of the tests might make the benefit vanish. > But to be sure, the only way is to try to implement this idea inside GMP, > which involves quite some work. > But implementing it with the current mpz type is

Re: mpn_mulhigh_basecase for Broadwell

2024-02-11 Thread marco . bodrato
Thank you Albin, currently we don't have any mulhi function, so we didn't decide a possible interface for it. There are some comments in the code where a full mul is used, but some "mulhi" function could be enough. Many of them seem to use unbalanced operands, and might require the exact

Re: Tuneup using size_ratio

2023-12-27 Thread marco . bodrato
Ciao, 23 dic 2023, 21:02 da ni...@lysator.liu.se: > marco.bodr...@tutanota.com writes: > >> But if I measure the speed with the ratio 0.7 or 0.8 I get quite >> different thresholds, which one should we use? >> > Maybe crossover is rather flat, i.e., a wide range with very small > difference in

Re: [PATCH] Fix typo in z13 bdiv_dbm1c.

2023-10-12 Thread marco . bodrato
nction. Contributed to the GNU project by Marco Bodrato. Copyright 2023 Free Software Foundation, Inc. This file is part of the GNU MP Library test suite. The GNU MP Library test suite is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License

gcc -fanalyzer

2023-09-16 Thread marco . bodrato
Ciao, gcc complains about missing call to 'va_end' in scanf/doscan.c and printf/doprnt.c . man stdarg on my Debian reads: "Each  invocation of va_copy() must be matched by a corresponding invocation of va_end() in the same function." So that probably gcc is right. I assume we never noticed,

Re: Fast constant-time gcd computation and modular inversion

2022-09-03 Thread Marco Bodrato
Il 2022-09-01 17:04 Torbjörn Granlund ha scritto: /* FIXME: Using mpz_invert is cheating. Instead, first compute m' = m^-1 (mod 2^k) via Newton/Hensel. We can then get the inverse via 2^{-k} (mod m) = (2^k - m') * m + 1)/2^k. */ mpz_invert (t, t, m); mpn_copyi (info->ip,

Re: More accurate calculation of sieve array size

2022-08-11 Thread Marco Bodrato
Ciao Adrien, Il 2022-08-11 17:43 Adrien Prost-Boucle ha scritto: For primorial, when thinking on the code I considered that the call to mpz_prodlimbs (x, factors, j) is the one responsible for (re)sizing x to whatever appropriate for the result. Indeed, prodlimbs can resize the mpz_t to

Re: More accurate calculation of sieve array size

2022-08-11 Thread Marco Bodrato
Ciao Adrien, Il 2022-08-10 15:37 Adrien Prost-Boucle ha scritto: Looking into oddfac and primorial computation code, I think there is a suboptimality in the computation of the prime sieve size. Could anyone confirm? Maybe there are some too-conservative assumptions, but we must be careful

Re: Use of AVX instructions in mpn_mul_1

2022-06-17 Thread Marco Bodrato
Ciao Thanassis, Il 2022-06-13 23:17 Thanassis Tsiodras ha scritto: I had a quick look at the x86_64 assembly implementations of the basic primitive used in multiplications (mpn_mul_1), and saw this: ...I could not find any use of AVX-integer-related multiplication instructions. I am talking

Re: Squaring optimization in mpz_addmul/mpz_submul

2022-05-29 Thread Marco Bodrato
Ciao Fredrik, Il 2022-05-25 22:48 Fredrik Johansson ha scritto: Now that mpn_mul no longer dispatches to mpn_sqr, the squaring optimization is missing in mpz_addmul/mpz_submul. Thanks for spotting this! mpz_addmul(sum, a, a); This should be easy to fix. It is. I fixed it with a

Re: stronglucas.c

2022-05-29 Thread Marco Bodrato
Ciao, Il 2022-05-18 19:32 Marco Bodrato ha scritto: Il Mer, 18 Maggio 2022 7:48 am, Niels Möller ha scritto: Seth Troisi writes: I was reading the stronglucas code Great! -if (((*PTR (n) & 6) == 0) && UNLIKELY (mpz_perfect_square_p (n))) Our test-suite did not trigger

Re: stronglucas.c

2022-05-18 Thread Marco Bodrato
Ciao, Il Mer, 18 Maggio 2022 7:48 am, Niels Möller ha scritto: > Seth Troisi writes: >> I was reading the stronglucas code Great! >> diff -r 970b7221873f mpz/stronglucas.c >> /* n is odd, to possibly be a square, n % 8 = 1 is needed. */ >> -if (((*PTR (n) & 6) == 0) && UNLIKELY

Re: about prime number generation

2022-05-14 Thread Marco Bodrato
Il 2022-05-01 17:01 Marco Bodrato ha scritto: Il 2022-04-30 23:39 Seth Troisi ha scritto: next_prime took up all of my energy We should further improve that work! With your improvements to nextprime, it's really a waste of resources if ones have to write for (int i = 0; i < delta2

Re: Use on small mulmod_bnm1 [was: New mulmod_bknp1]

2022-05-01 Thread Marco Bodrato
Ciao Niels, for some reason I did not receive your answer. I took the text from the archive. Il gio, 21 aprile 2022 08:15 am, Niels Möller ha scritto: > Marco Bodrato writes: >> Yes, with a larger expected gain for 3, and a smaller one for 13, or 17. > > And in your table, a

Re: about prime number generation

2022-05-01 Thread Marco Bodrato
Ciao Seth, Il 2022-04-30 23:39 Seth Troisi ha scritto: I worked on nth_prime(n) in a series of patches three years ago That code was very interesting. It would be nice to get rid of the double type and the function log(). The library is not using math.h and libm anywhere now. But it doesn't

Re: Use on small mulmod_bnm1 [was: New mulmod_bknp1]

2022-04-20 Thread Marco Bodrato
Ciao, Il 2022-04-19 21:03 ni...@lysator.liu.se ha scritto: Marco Bodrato writes: In the 128-2048 range (at least on that machine: shell.gmplib.org) the sizes multiple of 12, 24, 48... should be preferred... I don't fully understand this, but if I got it right, we want the size n

Use on large mul_fft [was: New mulmod_bknp1]

2022-04-16 Thread Marco Bodrato
Il 2022-04-15 19:56 Marco Bodrato ha scritto: Ciao, Il 2022-02-15 11:48 Marco Bodrato ha scritto: I pushed some new functions to compute the product (and square) modulo B^nn+1, for small values of the exponent nn. Currently that code is used by two functions. One is mulmod_bnm1

Re: mul_fft, cleaning up some details of the code

2022-03-08 Thread Marco Bodrato
Ciao Paul, Il 2022-03-08 16:20 Paul Zimmermann ha scritto: Uhm, the last line of code just before that ones is: cc = mpn_sub_1 (r, r, m, cc) + 1; /* add 1 to cc instead of rd since rd might overflow */ it seems you are right! Well, I pushed a clean-up for that portion of the code

Re: mul_fft, cleaning up some details of the code

2022-03-08 Thread Marco Bodrato
Ciao Paul, Il 2022-03-08 12:56 Paul Zimmermann ha scritto: Since you deeply know how this code works, I ask you one more question. The last lines of the function mpn_fft_mul_2exp_modF (branch m < n) contain: /* now subtract cc and rd from r[m..n] */ r[n] = -mpn_sub_1 (r + m, r + m, n -

Re: mul_fft, cleaning up some details of the code

2022-03-08 Thread Marco Bodrato
Ciao Paul, Il 2022-03-07 10:28 Paul Zimmermann ha scritto: Date: Sun, 06 Mar 2022 11:14:49 +0100 From: Marco Bodrato Specifically I'd focus into a suspicious piece of code, shared by both our current code and Vivien's. if (cy >= 0) cy = mpn_add_1 (tmp, tmp, Kl,

Re: mul_fft, cleaning up some details of the code

2022-03-06 Thread Marco Bodrato
Ciao, Il 2022-03-06 12:16 Torbjörn Granlund ha scritto: Marco Bodrato writes: The comment before the mpn_mul_fft_decompose function says "We must have nl <= 2*K*l", this means that we should never have ((dif = nl - Kl) > Kl), and the code in that branch should nev

mul_fft, cleaning up some details of the code

2022-03-06 Thread Marco Bodrato
Ciao, I looked into the code published by Samuel Vivien. I realised that in mul_fft there are a lot of _add_1 and _sub_1. At least some of them can be easily replaced by _INCR_U or _DECR_U... Specifically I'd focus into a suspicious piece of code, shared by both our current code and

Re: New mulmod_bknp1

2022-03-04 Thread Marco Bodrato
Ciao, Il 2022-02-23 12:35 Marco Bodrato ha scritto: ... then maybe I was wrong when I wrote that we should not trade a factor 3 with a factor 2... Yes, I think I was wrong. I pushed to the repository the use of the _bknp1 (mod B^kn+1) code on the +1 side of the _bnm1 multiplication code

Re: binvert_limb speedup on 64 bit machines with UHWtype

2022-03-01 Thread Marco Bodrato
Ciao, Il 2022-02-27 16:52 Marco Bodrato ha scritto: Il 2022-02-25 17:06 John Gatrell ha scritto: I tested using UHWtype in the macro for binvert_limb. On a 64 bit machine one of my programs gained a 3% speedup. On a 32 bit machine, there was no Should we use uint8_fast_t, uint16_fast_t

Re: binvert_limb speedup on 64 bit machines with UHWtype

2022-02-27 Thread Marco Bodrato
Ciao, Il 2022-02-27 18:13 Marco Bodrato ha scritto: I did never look into that file :-) I inserted there a few more versions of binvert_limb. The attached patch is only for testing, not to be pushed (I used uint_fast##_t types). On shell I get the following: @shell ~/gmp-repo]$ /var/tmp

Re: binvert_limb speedup on 64 bit machines with UHWtype

2022-02-27 Thread Marco Bodrato
Ciao Torbjörn, Il 2022-02-27 10:53 Torbjörn Granlund ha scritto: Perhaps one could write it (n & 0xff)/2 and get better code Actually, this trick is already used! In tune/modlinv.c :-/ The following line dates back to 2002: __inv = binvert_limb_table[(__n&0xFF)/2]; /* 8 */ \ I

Re: binvert_limb speedup on 64 bit machines with UHWtype

2022-02-27 Thread Marco Bodrato
Ciao John Gartell, Il 2022-02-25 17:06 John Gatrell ha scritto: Hi everyone. I'm new here so I don't know how to submit a new gmp-impl.h This list is the correct place for such a discussion. I tested using UHWtype in the macro for binvert_limb. On a 64 bit machine one of my programs gained

Re: New mulmod_bknp1

2022-02-23 Thread Marco Bodrato
Ciao Niels and David, Il 2022-02-23 09:14 ni...@lysator.liu.se ha scritto: N^3-1 = (N^2+N+1)/3 * 3^{k+1} * (N-1)/3^k Could be implemented as one multiply each mod 2^N+N+1 and (N-1), followed by reduction mod (N^2+N+1)/3 and (N-1)/3^k at the end; these reductions correspond to small quotients

Re: New mulmod_bknp1

2022-02-22 Thread Marco Bodrato
Ciao David, Il Mar, 22 Febbraio 2022 10:55 pm, David Harvey ha scritto: > On Tue, 2022-02-22 at 22:39 +0100, Marco Bodrato wrote: >> > E.g, in this case we could try a top-level B^66 - 1 product, split in >> > B^33+1 and B^33-1; then the former suits your new algorithm well

Re: New mulmod_bknp1

2022-02-22 Thread Marco Bodrato
Ciao, Il Mar, 22 Febbraio 2022 8:04 pm, Niels Möller ha scritto: > Marco Bodrato writes: >> Simply, if a multiplication mod B^{3n}+1 is needed, the code computes >> - a product mod B^{n}+1 >> - a product mod B^{2n}-B^{n}+1 >> - with CRT, the desired result

Re: New mulmod_bknp1

2022-02-22 Thread Marco Bodrato
Ciao, Il 2022-02-21 01:37 Torbjörn Granlund ha scritto: I am too busy to examine the code to see what you've done. Perhaps you could outline the algorithms here? Nothing special... Simply, if a multiplication mod B^{3n}+1 is needed, the code computes - a product mod B^{n}+1 - a product mod

Re: New mulmod_bknp1

2022-02-18 Thread Marco Bodrato
Ciao, Il 2022-02-15 11:48 Marco Bodrato ha scritto: I pushed some new functions to compute the product (and square) modulo B^nn+1, for small values of the exponent nn. For large values the already available fft_mul should be used. A possible use is replacing the last-level point-wise

New mulmod_bknp1

2022-02-15 Thread Marco Bodrato
be supported, but it's currently disabled. I added also tune/speed support, but the problem with the analysis is that the function is defined for some values only... Here are some results: @shell ~/gmp-repo]$ /var/tmp/bodrato/gmp/hg/build/tune/speed -Cr -s 75-85\,240-250\,725-735 mpn_mul_n

Re: mpq_mul_ui

2022-02-08 Thread Marco Bodrato
Ciao, Il 2022-01-23 01:34 Marc Glisse ha scritto: Hello, What would you think of adding mpq_mul_ui, mpq_div_ui, mpq_ui_div, and also the _z versions? I tried to think to the mini-mpq version of the _z flavor. diff -r ed0406cf3c70 mini-gmp/mini-mpq.c --- a/mini-gmp/mini-mpq.c Wed Feb

Re: Error handler for GMP?

2022-02-01 Thread Marco Bodrato
Ciao, Il 2021-06-12 16:49 Marco Bodrato ha scritto: Anyway, for internal coherence, I think that we should at least check that all the functions in the library that might "abort" do this using an "exception". For this, I propose the attached patch. That patch ("Ha

Re: mpq_mul_ui

2022-01-31 Thread Marco Bodrato
Ciao, Il Mer, 26 Gennaio 2022 7:53 am, Marco Bodrato ha scritto: > Il 2022-01-24 10:04 Torbjörn Granlund ha scritto: >> Marc Glisse writes: >> >> What would you think of adding mpq_mul_ui, mpq_div_ui, mpq_ui_div, >> and >> also the _z versions? >> >

Re: mpq_mul_ui

2022-01-25 Thread Marco Bodrato
Ciao, Il 2022-01-24 10:04 Torbjörn Granlund ha scritto: Marc Glisse writes: What would you think of adding mpq_mul_ui, mpq_div_ui, mpq_ui_div, and also the _z versions? That would make sense to me. I agree too, I just remember an observation about the naming convention:

Re: proposal to change the return type/value of mpz_primorial_ui

2021-11-14 Thread Marco Bodrato
- Returns the number of primes in the range [1..N]. Contributed to the GNU project by Marco Bodrato. THE FUNCTION IN THIS FILE IS INTERNAL WITH A MUTABLE INTERFACE. IT IS ONLY SAFE TO REACH IT THROUGH DOCUMENTED INTERFACES. IN FACT, IT IS ALMOST GUARANTEED THAT IT WILL CHANGE OR DISAPPEAR IN A FUTURE G

Re: proposal to change the return type/value of mpz_primorial_ui

2021-11-11 Thread Marco Bodrato
Ciao, Il 2021-11-08 18:57 Marco Bodrato ha scritto: I got inspired by a piece of code that Seth Troisi sent a couple of years ago and I wrote a gmp_pi_ui(n) function with two variants, a trivial one (it calls gmp_primesieve) and an implementation of the Legendre strategy. I attach

Re: Doc tweak

2021-11-09 Thread Marco Bodrato
Ciao Marc, Il 2021-11-06 13:35 Marc Glisse ha scritto: someone got confused by the text saying that all the GMP declarations are in gmp.h and thought the C++ class interface would be there as well, so I'll probably add this patch unless someone has a better proposition. I agree. Ĝis, m

Re: proposal to change the return type/value of mpz_primorial_ui

2021-11-08 Thread Marco Bodrato
Ciao, Shane, On Sat, Nov 6, 2021 at 1:31 PM Marco Bodrato wrote: Il 2021-11-03 19:18 Shane Neph ha scritto: > mpz_primorial_ui( ) is a great function. Why not return the actual > number of primes that go into that calculation? It may make sense, but only if we add also another fu

Re: proposal to change the return type/value of mpz_primorial_ui

2021-11-06 Thread Marco Bodrato
Ciao, Il 2021-11-03 19:18 Shane Neph ha scritto: mpz_primorial_ui( ) is a great function. Why not return the actual number of primes that go into that calculation? It may make sense, but only if we add also another function: a way to compute the number of primes in a range; or at least in

Re: [Request for comments] Potential room for speedup when calculating divmod() of bases with many trailing 0 bits (in binary)

2021-11-01 Thread Marco Bodrato
Ciao, Il 2020-09-21 17:30 Marco Bodrato ha scritto: Il 2020-09-14 18:50 Shlomi Fish ha scritto: I was able to improve upon mpz_mod here: This is in case the number that one divides by is itself divisible by a large power of 2. There are many special forms for the divisors that can

Re: Suggested tune/tuneup.c patch

2021-11-01 Thread Marco Bodrato
Ciao, Il 2021-11-01 01:04 Torbjörn Granlund ha scritto: Marco Bodrato writes: fac_ui... and I find an empty page if I look at https://gmplib.org/devel/thres/2021-10-30/FAC_ODD_THRESHOLD Still there, but for some reason nginx had stopped serving gz pages, and there were no non-gzip

Re: Suggested tune/tuneup.c patch

2021-10-31 Thread Marco Bodrato
Ciao, Il 2019-09-11 19:37 t...@gmplib.org ha scritto: I added a history preservation feature to the .../devel/thres/ pages. At 23:59 each night, all pages are copied to .../devel/thres/-MM-DD. (There is no index, one needs to type in the wanted date manually.) Is this still active? I

Re: Please update addaddmul_1msb0.asm to support ABI in mingw64

2021-10-06 Thread Marco Bodrato
Ciao Niels, Il 2021-10-06 21:54 ni...@lysator.liu.se ha scritto: Now, question is if it can beat mul_1 + addmul_1. I don't know. Well, the question is also if the current code beats mul_1 + addmul_1 :-) Ĝis, m ___ gmp-devel mailing list

Re: Please update addaddmul_1msb0.asm to support ABI in mingw64

2021-10-06 Thread Marco Bodrato
Ciao, Il 2021-10-06 17:21 ni...@lysator.liu.se ha scritto: Marco Bodrato writes: I agree. Let's choose between the two last versions, and I'll push it. Nice with a few instructions less, feel free to push the version you like best. I pushed the last one: https://gmplib.org/repo/gmp/rev

Re: Please update addaddmul_1msb0.asm to support ABI in mingw64

2021-10-06 Thread Marco Bodrato
Ciao Niels, Il 2021-10-05 20:56 ni...@lysator.liu.se ha scritto: Marco Bodrato writes: mov %r11, -16(rp) mov %r10, %r11 jmp L(one) I had hoped this jump and preceding instructions could be eliminated, to get a structure like ja L(two

Re: Please update addaddmul_1msb0.asm to support ABI in mingw64

2021-10-05 Thread Marco Bodrato
Il 2021-10-05 07:38 ni...@lysator.liu.se ha scritto: Some potential further improvements: It would likely be possible to order the cases at the end as L(nul), L(one), L(two), and let the nul case fall through into the one case, reducing the size a bit. And a mulx version could likely eliminate

Re: Please update addaddmul_1msb0.asm to support ABI in mingw64

2021-10-04 Thread Marco Bodrato
changes? 8<- *** /tmp/extdiff.X4CyBq/gmp-err.b03b3cb1ce98/mpn/x86_64/addaddmul_1msb0.asm 2020-11-16 21:41:15.0 +0100 --- /home/bodrato/src/gmp-err/mpn/x86_64/addaddmul_1msb0.asm 2020-10-29 04:37:12.369375684 +0100 *** define(`u0',`%r8') *** 51

Re: div_qr_1n_pi1

2021-06-14 Thread Marco Bodrato
Ciao, Il 2021-06-06 22:16 Torbjörn Granlund ha scritto: ni...@lysator.liu.se (Niels Möller) writes: Maybe we should have some macrology for that? Or do all relevant processors and compilers support efficient cmov these days? I'm sticking to masking expressions for now. Let's not trust

Re: Error handler for GMP?

2021-06-12 Thread Marco Bodrato
Ciao John! È un piacere vederti da queste parti! Il 2021-03-22 09:55 abb...@dima.unige.it ha scritto: Does GMP offer a way to return/throw rather than simply aborting upon overflow? No, it doesn't. Yet. I could not see anything relevant in the documentation. The theme emerges every now

Re: pointers vs arrays

2021-06-09 Thread Marco Bodrato
Ciao, Il 2021-05-12 15:17 Marc Glisse ha scritto: the latest version of gcc has a new (questionable) warning that complains if a function is declared as taking a mpz_t parameter and redeclared as taking mpz_ptr, or the reverse. We might as well be consistent and use pointers everywhere, as in

Re: div_qr_1n_pi1

2021-06-03 Thread Marco Bodrato
Ciao, Il 2021-06-03 12:40 Torbjörn Granlund ha scritto: If we dare use cmov (and its presumed side-channel leakage) we could probably shorten the critical path by a cycle. The "sbb" and "and" would go away. Using masks does not always give the fastest code. I tried the following variation

Re: Side channel silent karatsuba / mpn_addmul_2 karatsuba

2020-12-08 Thread Marco Bodrato
project by Torbjorn Granlund. Modified by Marco BODRATO to obtain a side-channel silent version. Copyright 2006-2010, 2012, 2014, 2018, 2020 Free Software Foundation, Inc. This code is free software; you can redistribute it and/or modify it under the terms of the GNU Affero General Public License

Re: mpz_prevprime

2020-11-28 Thread Marco Bodrato
Ciao, Il 2020-11-16 21:23 Seth Troisi ha scritto: Thanks Marco! I'm really happy this is merged. Your code was merged with a storm of new code that has been reversed. But now it's in again! Was there any specific things you thought could be improved? Happy to look at those too. Memory

Re: mpz_prevprime

2020-11-16 Thread Marco Bodrato
Ciao, Il 2020-10-16 09:51 Seth Troisi ha scritto: I included a patch with the rename but not including Marco's tdiv I pushed most of the changes you proposed. I only removed some portion of the proposed tests/mpz/t-nextprime.c. You are right, also the return value of the new function should

Re: 6.2.0 build failure on x86_64 Skylake PC - FIX

2020-10-25 Thread Marco Bodrato
Ciao, Il 2020-10-15 07:14 Marco Bodrato ha scritto: we are not changing much the gmp-6.2 branch, and we should release. Is there something important we are missing as a bug fix? I searched the gmp-devel mailing list and we missed to take any action after the following, about msys. It seems

Re: mpz_prevprime

2020-10-16 Thread Marco Bodrato
Ciao, Il 2020-10-16 09:51 Seth Troisi ha scritto: won't this cause m = 2*prime, instead of the original result, m = 0? I used m = (diff > 0 && m) ? prime - m : m; to make sure that p can be marked as composite. Oops, you are right! I fear that the short-circuit evaluation of && can force the

Re: mpz_prevprime

2020-10-16 Thread Marco Bodrato
Post scriptum: Il Ven, 16 Ottobre 2020 7:13 am, Marco Bodrato ha scritto: >m = mpz_tdiv_ui(p, prime); >m = (diff > 0) ? prime - m : m; I remember that, in this context, if p = 0 (mod prime), the result m = prime is as good as the result m = 0. Because the next two lines are:

Re: mpz_prevprime

2020-10-15 Thread Marco Bodrato
Ciao, Il Ven, 16 Ottobre 2020 1:47 am, Seth Troisi ha scritto: > On Thu, Oct 15, 2020 at 10:44 AM Niels Möller >> Seth Troisi writes: >> > Technically I can restructure the code to make both use with >> mpz_fdiv_ui >> > static int >> > findnext (mpz_ptr p, >> > - unsigned

Re: mpz_prevprime

2020-10-03 Thread Marco Bodrato
Ciao, Il 2020-10-03 03:58 Seth Troisi ha scritto: On Thu, Mar 26, 2020 at 2:00 PM Marco Bodrato Il 2020-03-25 02:25 Seth Troisi ha scritto: + t = diff > 0 ? ((t + 1) | (t > 1)) : + ((t == 3) ? 2 : ((t - 2) | 1)); Maybe move this to the caller side? Or partially, leaving here just ASS

Re: mpz_prevprime

2020-10-03 Thread Marco Bodrato
Ciao, Il 2020-10-03 03:58 Seth Troisi ha scritto: I modified the patch a tiny bit. Still hoping to get this in for an I think that the patch is interesting: a function for searching primes backward in the sequence of integers is missing and seems useful. The proposed interface is the

Re: mpn_set_str_bits

2020-09-30 Thread Marco Bodrato
Ciao, Il 2020-09-30 19:37 ni...@lysator.liu.se ha scritto: Marco Bodrato writes: limb = (unsigned int) sp[sn] >> (bits - shift); That's easier to read than what I proposed. Maybe worth a comment mentioning the problem case: mp_limb_t Thanks Niels! So, here is the current pr

Re: mpn_set_str_bits

2020-09-30 Thread Marco Bodrato
Ciao! Il 2020-09-30 09:03 ni...@lysator.liu.se ha scritto: Marco Bodrato writes: limb = sp[sn]; if (GMP_LIMB_BITS > CHAR_BIT || shift > 0) limb >>= bits - shift; else limb = 0; Do we really need to support bits == GMP_LI

Re: mpn_set_str_bits

2020-09-29 Thread Marco Bodrato
Ciao, Il 2020-09-29 12:15 Raphael Rieu-Helft ha scritto: The attached patch slightly improves the mini-gmp function mpn_set_str_bits. The invariants are also a bit clearer (shift is the The loop in mpz_import uses another strategy, a temporary limb. This reduces the number of write operations

Re: State of PRNG code in GMP

2020-06-02 Thread Marco Bodrato
Ciao, Il 2020-06-02 11:22 t...@gmplib.org ha scritto: I'd like to have a coherent interface which also provide reentrant random functions to the mpn layer. And in no case should mpn pull in mpz. Makes sense. Unfortunately, Mersenne Twister uses mpz, and I am not saying that that was a bad

Re: mini-gmp warnings

2020-04-26 Thread Marco Bodrato
Il 2020-04-26 16:22 ni...@lysator.liu.se ha scritto: Is there an easy way to run mini-gmp tests with small limb size? In mini-gmp/mini-gmp.h we have the following lines: #ifndef MINI_GMP_LIMB_TYPE #define MINI_GMP_LIMB_TYPE long #endif typedef unsigned MINI_GMP_LIMB_TYPE mp_limb_t; So, you

Re: mini-gmp warnings

2020-04-26 Thread Marco Bodrato
Ciao, Il 2020-04-26 15:11 ni...@lysator.liu.se ha scritto: ni...@lysator.liu.se (Niels Möller) writes: There are a couple of other uses of LOCAL_SHIFT_BITS, LOCAL_GMP_LIMB_BITS, LOCAL_CHAR_BIT, added as part of the support for testing with small limb sizes. Is it for some reason important to

Re: mini-gmp "bug" missing mpz_fits_sint_p / mpz_fits_uint_p

2020-04-20 Thread Marco Bodrato
Ciao, Il 2020-04-20 11:08 Vincent Lefevre ha scritto: I think that in general, you should not write code that depends on whether INT_MAX + INT_MIN == 0 or not (the constant INT_MAX + INT_MIN might be useful in some rare cases, but I think that testing whether this constant is 0 or not should be

Re: mini-gmp "bug" missing mpz_fits_sint_p / mpz_fits_uint_p

2020-04-19 Thread Marco Bodrato
Il 2020-04-19 10:44 ni...@lysator.liu.se ha scritto: Marco Bodrato writes: +int +mpz_fits_sint_p (const mpz_t u) +{ + return (INT_MAX + INT_MIN == 0 || mpz_cmp_ui (u, INT_MAX) <= 0) && +mpz_cmpabs_ui (u, GMP_NEG_CAST (unsigned long int, INT_MIN)) <= 0;

Re: mini-gmp "bug" missing mpz_fits_sint_p / mpz_fits_uint_p

2020-04-19 Thread Marco Bodrato
Ciao, On the gmp-discuss list, Il 2020-04-11 21:41 Simon Sobisch ha scritto: mini-gmp provides mpz_fits_slong_p ad mpz_fits_uslingt_p, but it does not provide the same for smaller integer types. We can easily add the requested functions. I suggest the following code: diff -r 805304ca965a

Re: possible speedup for mpz_nextprime_small

2020-04-03 Thread Marco Bodrato
Il 2020-03-27 21:36 Seth Troisi ha scritto: note I didn't get the time to look other the patch so it might be extra rought. Messaggio originale Oggetto: Re: Cleanup mpz_out_str in tests Il 2020-03-26 10:10 Seth Troisi ha scritto: "Why" today was that I had a little time

Proposed patch for mpn_trialdiv

2020-04-03 Thread Marco Bodrato
Ciao, the current code in mpn/generic/trialdiv.c contain a comment that suggest to remove "one of the outer loop conditions". The patch I attach removes it by checking only the number of primes, and limiting the maximum for the latter. I also added a possible optimisation for a single

Re: mpz_prevprime

2020-03-26 Thread Marco Bodrato
Ciao, Il 2020-03-25 02:25 Seth Troisi ha scritto: I'm back with a new mpz_prevprime patch diff -r 805304ca965a doc/gmp.texi +@deftypefun int mpz_prevprime (mpz_t @var{rop}, const mpz_t @var{op}) +If previous prime doesn't exist (e.g. @var{op} < 3), rop is unchanged and +0 is returned. I

Re: possible speedup for mpz_nextprime_small

2020-03-26 Thread Marco Bodrato
Ciao, Il 2020-03-26 01:06 Seth Troisi ha scritto: Marco was interested in what using dynamically allocating primes would look like. I've attached a couple of screenshots. I'd like to understand the reason why the line "dynamic" is 5-10% lower than the other lines in the range above 27...

Re: Cleanup mpz_out_str in tests

2020-03-26 Thread Marco Bodrato
Ciao, Il 2020-03-26 02:15 Seth Troisi ha scritto: This cleans up a number of printf(...) mpz_out_str(stdout, 10/16, var); printf(...); and replaces them with gmp_printf(...%Zd..., var); Why? Is the current code not working? In mini-gmp we have mpz_out_str, but we don't have gmp_printf.

Re: mpz_prevprime

2020-03-24 Thread Marco Bodrato
Ciao, Il 2020-03-24 18:54 Seth Troisi ha scritto: On Tue, Mar 24, 2020 at 9:56 AM Marco Bodrato wrote: I propose a variation of your patch, you find it attached, on my computer it's faster. Couple of small notes but otherwise this looks good +/* NP_SMALL_LIMIT = prevprime (LAST_PRIME ^ 2

Re: mpz_prevprime

2020-03-24 Thread Marco Bodrato
complex, it may be interesting to test if it is worth. On Mon, Mar 23, 2020 at 2:38 PM Marco Bodrato wrote: It might also be useful to commit tune_nextprime_small which dynamically selects a higher reps count for mpz_nextprime input and helps increase precision. Uhm... j = 20 / s->s

Re: mpz_prevprime

2020-03-23 Thread Marco Bodrato
Il 2020-03-23 22:38 Marco Bodrato ha scritto: Using unsigned long to compute the modulus is obviously faster than many calls to mpz_cdiv_ui. I tried a patch that surely is not as fast as yours, but should speed-up all numbers that fits in a single unsigned long. The speed-up seems small, I'm

Re: mpz_prevprime

2020-03-23 Thread Marco Bodrato
Ciao, Il 2020-03-22 10:00 Seth Troisi ha scritto: I measure the regression as ~5-15% for input < ~10,000 And no regression for larger input? Good. I have a proposed patch which uses int, and trial div up to some threshold this appears to be 5x faster. A couple of question arises. 5x

Re: mpz_prevprime

2020-03-21 Thread Marco Bodrato
Il 2020-03-21 11:42 Seth Troisi ha scritto: I see this was pushed, with a few more polishing tweaks. Yes, but there are bad news, it introduces a speed regression for small numbers. I see you also added more testing in tests/devel/primes.c which is a great triple check. It looks like the

Re: mpz_prevprime

2020-03-17 Thread Marco Bodrato
Ciao, Il 2020-03-16 02:23 Seth Troisi ha scritto: On Sun, Mar 15, 2020 at 4:38 PM Marco Bodrato wrote: May I write a few more comments? Always, my opinion about being ready is just that it's passed the bar of being good enough not that it's perfect. I'm tempted to simply commit your

Re: Moving LOOP_ON_SIEVE_* macros to gmp-impl.h

2020-03-17 Thread Marco Bodrato
Ciao, Il 2020-03-16 04:36 Seth Troisi ha scritto: Per Marco's comments in my prev_prime/next_prime patch I moved some of the primesieve macros and functions to gmp-impl.h There are two reasons why I never moved those functions and macros to gmp-impl.h, two aspects of one problem: the

Re: mpz_prevprime

2020-03-15 Thread Marco Bodrato
Ciao, Il 2020-03-15 00:07 Seth Troisi ha scritto: New patch which is cleaned up and IMO ready to commit May I write a few more comments? On Mon, Mar 9, 2020 at 5:15 PM Seth Troisi wrote: I also dislike the boiler plate of the macros but I didn't Those macros should probably be moved

Re: [PATCH] mini-gmp: pass correct old_size to custom reallocate function

2020-03-09 Thread Marco Bodrato
Ciao, Il Lun, 9 Marzo 2020 12:30 pm, Niels Möller ha scritto: > Marco Bodrato writes: >> For the style, I do not know which one is better: >> "if(c){v=val}else{v=0};" or "v=0;if(c){v=val};" > I don't have a very strong opinion on this point, but when it's

Re: [PATCH] mini-gmp: pass correct old_size to custom reallocate function

2020-03-09 Thread Marco Bodrato
Ciao, Il Lun, 9 Marzo 2020 2:08 pm, Niels Möller ha scritto: > Marco Bodrato writes: > >> -gmp_free_func (rden, 0); >> +gmp_free_func (rden, lden * sizeof (char)); > > sizeof (char) == 1 by definition, so no need to multiply with it. Of cours

Re: mpz_prevprime

2020-03-09 Thread Marco Bodrato
Ciao, Il 2020-03-09 11:59 Seth Troisi ha scritto: On Mon, Mar 9, 2020 at 2:05 AM Marco Bodrato wrote: Ciao, Il 2020-03-09 02:56 Seth Troisi ha scritto: It's highly possible I misunderstand how these macros are supposed to be used. I also dislike the boiler plate of the macros but I

Re: [PATCH] mini-gmp: pass correct old_size to custom reallocate function

2020-03-09 Thread Marco Bodrato
Ciao, Il 2020-03-07 21:27 minux ha scritto: All comments addressed, except as noted below. I also fixed similar issues in mini-mpq.c changes. On Sat, Mar 7, 2020 at 2:26 PM Niels Möller wrote: minux writes: I'm not that familiar with the mpq functions. I hope Marco can comment. I had

Re: mpz_prevprime

2020-03-09 Thread Marco Bodrato
Ciao, Il 2020-03-09 02:56 Seth Troisi ha scritto: From Marco Bodrato's analysis I got inspired and modified nextprime to Your analysis is much deeper than mine :-) I do not have much time now... but I glanced at the code and write here a few comments. You can see some of my thresholds and

Re: Overflow in mpz_cmp

2020-02-18 Thread Marco Bodrato
Ciao, Il 2020-02-18 00:35 Guillaume Melquiond ha scritto: Le 17/02/2020 à 22:59, Marco Bodrato a écrit : Did you apply formal verification to other functions? Did they succeed? Here is the list of verified functions: mpn_powm mpn_sqrtrem Those are particularly interesting for me

Re: Overflow in mpz_cmp

2020-02-17 Thread Marco Bodrato
Ciao, Il 2020-02-13 08:54 Guillaume Melquiond ha scritto: Note that mpz_cmpabs does not have any overflow issue, since absolute sizes are subtracted there. So, except maybe for homogeneity with mpz_cmp, it can keep using the optimized version. I pushed a patch for both cmp and cmpabs. So that

Re: mpz_prevprime

2020-02-12 Thread Marco Bodrato
Ciao, Il Mer, 12 Febbraio 2020 7:26 am, Niels Möller ha scritto: > Marco Bodrato writes: >> Maybe my estimates are wrong. If they are not, the limit for trial >> division should be increased more than linearly on the size of the > And current code uses > prime_limi

Re: mpz_prevprime

2020-02-11 Thread Marco Bodrato
Ciao, Il 2020-02-10 21:01 ni...@lysator.liu.se ha scritto: On my laptop, it gives a speedup of about 25% for larger sizes. Not sure how to tune for small sizes, but clearly the old code clamping the size of the prime table based on the bit size is better than doing nothing. The computation of

Re: Overflow in mpz_cmp

2020-02-11 Thread Marco Bodrato
Ciao, Il 2020-02-11 14:56 Marc Glisse ha scritto: On Tue, 11 Feb 2020, Niels Möller wrote: if (usize != vsize) return (usize > vsize) ? 1 : -1; On x86_64, both gcc and clang optimize (usize > vsize) ? 1 : -1 to 2 * (usize > vsize) - 1 (as a single LEA for gcc, 2 ADD for llvm). So the

Re: Overflow in mpz_cmp

2020-02-11 Thread Marco Bodrato
Ciao, Il 2020-02-11 14:42 ni...@lysator.liu.se ha scritto: Marco Bodrato writes: diff -r f5601c2a8b11 mpz/cmp.c --- a/mpz/cmp.c Sun Feb 09 16:16:19 2020 +0100 +++ b/mpz/cmp.c Tue Feb 11 14:20:39 2020 +0100 @@ -35,15 +35,15 @@ + cmp = (usize > vsize) - (usize < vsize); + if (cm

Re: Overflow in mpz_cmp

2020-02-11 Thread Marco Bodrato
Ciao, Il 2020-02-10 18:25 Guillaume Melquiond ha scritto: When the operand sizes do not match, the mpz_cmp function function just returns the difference of the signed sizes. Unfortunately, this difference might not fit inside the "int" return type, when the numbers are of opposite sign. In

Re: mpz_prevprime

2020-02-09 Thread Marco Bodrato
Ciao, Il 2020-01-30 21:12 Seth Troisi ha scritto: I've disabled both gap test but left the code is so that I can uncomment and run it when doing serious changes to the file. If the gap tests are basically equivalent to running many repetitions of test_ref, you can trigger it when the number

Re: mpz_prevprime

2020-02-05 Thread Marco Bodrato
Ciao, Il 2020-02-05 00:59 Seth Troisi ha scritto: I got VERY VERY VERY lucky and found a prime gap > 2^15 with the 4th highest merit of ANY KNOW PRIME GAP. Great! Given Marco's timing of 25 seconds (and a goal of say 3 seconds on that machine) the start prime would need to be ~~200 digits.

  1   2   3   4   5   >