Ciao David,

Il Ven, 15 Giugno 2012 2:19 am, David Cleaver ha scritto:
> I would be happy to contribute code to GMP and/or the FSF.  Please let
> me know how best to accomplish this.

The first step is: produce a valuable piece of code; the next one is some
paperwork (copyright assignments to FSF), but I don't remember the
details.

> I'm not sure how fast people would expect the BPSW test to be.

I'd expect it to be comparable in speed with the current
mpz_probab_prime_p (tested_number, 5);
with five or six repetitions.

> I can't find the reference right now.  I have just run some timings on
> my SPRP test vs my BPSW test.

When you compare two algorithms, it makes sense to implement both (using
comparable skills and efforts) and compare them...
But, if you want to propose your code for insertion into GMP, you should
better compare your code with the code in the library.

> It looks like the BPSW test takes 9 times longer to run
> than the SPRP test.  I think we should be able to reduce this quite
> a bit, since I implemented it all with mpz's and I did not necessarily
> use the fastest algorithms in my implementation.

I wrote some code for the strong Lucas primality test, with Selfridge
parameters. It takes 3-5 times the time of a Miller-Rabin iteration. It
should be optimised, using Montgomery reduction, and possibly using some
cleverer formulas. (you can also use it for your code, as any other piece
of code already assigned to FSF)

> Please let me know what you think of the above and if you are still
> interested in having a BPSW test integrated into GMP.

I am. If we obtain an efficient code, I'm confident that the other
developers will accept the insertion.


But there still are some questions.

After trial division, BPSW requires a base-2 Miller-Rabin test, i.e.
computing some powers of 2 modulo N, the exponent being
(N-1)/2^some_thing.
The formulas I've found to compute Lucas numbers use some powers of Q
modulo N, the exponent being (N+1)/2^some_exp.
(N+1) and (N-1) typically share most of the bits, Q=2 when we choose D=-7
and we can choose it when jacobi(-7/N)=-1, i.e. roughly half of the times.
It would be nice to mix the two tests in one, whenever we can. Do you
think we can use D=-7 even if jacobi(5/N)=-1 and we should use D=5
according to the rule "the first in the list 5, -7, 9, -11, 13,..."?


The attached code shows the times of Lucas test compared to base-2
Miller-Rabin, for Mersenne numbers (where Miller-Rabin gives many false
positives, and Lucas-Lehmer is used by my code) and factorials+1 (where
the search for a small D such that jacobi(D/N)=-1 is longer).

Best regards,
Marco

-- 
http://bodrato.it/
/* Primality testing.

Copyright 1991, 1993, 1994, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2005 Free
Software Foundation, Inc.  Miller-Rabin code contributed by John Amanatides.

Copyright 2012, Marco Bodrato

This file is not yet part of anything, but intended for the GNU MP Library.

This file is free software; you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation; either version 3 of the License, or (at your
option) any later version.

This file is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Affero General Public
License for more details.

You should have received a copy of the GNU Affero General Public License
along with this file.  If not, see http://www.gnu.org/licenses/.
*/

/* 
  Testing progam. It needs GMP sources to be compiled, after make&&(cd tune;make speed).
  Compile & test with something like:
  gcc -DMAIN -Itune -I. -O2 primetest.c .libs/libgmp.a tune/.libs/libspeed.a -o pt&&./pt
*/

#include "gmp.h"
#include "gmp-impl.h"
#include "longlong.h"

#ifdef MAIN
#include "speed.h"

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#else /* MAIN */

#endif /* MAIN */

/*********************************************************/
/* Section Lucas: Strong Lucas, for BPSW.                */
/*********************************************************/

/* int mpz_stronglucas_selfridgelehmer_p (mpz)

 Perform a strong Lucas primality test on the argument, using the
 parameters suggested by Selfridge. If the program detects a Mersenne
 number, it performs the Lucas-Lehmer primality test, instead.

 Returns 2 on Mersenne primes.
 Returns 1 on odd positive primes and odd positive pseudoprimes.
 Returns 0 otherwise (returned also for 2 or negative primes).

 So: 0 means "< 3 or composite", 1 means "probably prime", 2 means "prime".

 FIXME: Before Lucas-Lehmer, we should probably distinguish the prime
 exponent from pseudo-prime ones.

 IMPROVEMENTS:
 Most of the times Q is -1 or a power of 2, shall we optimise the two
 mul_ui involving q?

 Half of the (relevant) times we have jacobi(-7/n) = -1. In this case
 Q = 2 and we compute some big powers of 2 (mod N). They should be
 recycled for the base-2 Miller-Rabin test required by the BPSW test.

 In general, we compute some big powers of Q, if it's not 1, we should
 use it for a further Miller-Rabin test.

 Use redc to speed-up modular reductions.
*/

int
mpz_stronglucas_selfridgelehmer_p (mpz_srcptr n)
{
  int ret = 0;

  /* FIXME: Returns 0 on 2 and negatives... */
  if (mpz_odd_p (n) && (mpz_cmp_ui (n, 1) > 0) && ! mpz_perfect_square_p (n)) {
    /* Worst case for this kind of test is n=pq, with p+2=q twin primes...
       we can avoid it: if mpz_perfect_square_p(n+1), n=(x+1)(x-1).
       see Arnault, F. "The Rabin-Monier Theorem for Lucas Pseudoprimes."	*/

    mpz_t qt, vt, tt;
    int q, sign;
    mp_bitcnt_t s, i;

    /* Find t and s, where t is odd and n + 1 = 2**s * t . */
    s = mpz_scan0 (n, 0);	/* n is odd, the +1 carry propagates up to the first 0. */
    ASSERT (s > 0);
    i = mpz_sizeinbase (n, 2);
    ASSERT (i == s || i > s + 1);

    mpz_init (tt);
    mpz_init (qt);
    mpz_init (vt);

    if (LIKELY (i > s)) {
      mpz_t ut;

      {
	sign = 4;
	/* D = P**2 - 4Q, P=1.

	   We have to choose D in the sequence 5, -7, 9, -11, ...
	   i.e. Q in the sequence -1, 2, -2... so that jacobi(D/n) = -1.

	   Should we skip 9 and squares? Should we detect D overflow?
	*/
	q = 1;
	do {
	  q += (sign < 0);
	  mpz_set_si (tt, 1 + sign * q);
	  sign = -sign;
	} while (mpz_jacobi(tt, n) != -1);
	sign /= 4; /* Q=sign*q, D=1-4*Q */
      }
 
      mpz_set_si (qt, sign*q);
      mpz_set_ui (vt, 1);

      i--;
      mpz_init_set_ui (ut, 1);
      /* Steps to generate Lucas(P,Q) sequences (from http://en.wikipedia.org/wiki/Lucas_sequence)
	 Single step:
	 U_{n+1} = (P*U_n+V_n)/2
	 V_{n+1} = ((P^2-4Q)U_n+P*V_n)/2
	 with P=1
	 U_{n+1} = (U_n+V_n)/2
	 V_{n+1} = ((1-4Q)U_n+V_n)/2 = (U_n+V_n)/2 - 2Q*U_n

	 Doubling step:
	 U_2n = U_n V_n
	 V_2n = V_n^2 -2Q^n
      */
      do {
	mpz_mul (tt, ut, vt);	/* U_2n = U_n V_n */
	mpz_mod (ut, tt, n);
	mpz_mul (tt, vt, vt);	/* V_n^2 - 2 Q^n*/
	mpz_submul_ui (tt, qt, 2);
	mpz_mod (vt, tt, n);
	mpz_mul (tt, qt, qt);	/* Q^2n = (Q^n)^2*/
	mpz_mod (qt, tt, n);
	i --;
	if (i == s || mpz_tstbit (n, i)) {
	  /* U_{n+1} = (U_n+V_n)/2 */
	  mpz_add (tt, ut, vt); 	/* ut < n, vt < n  =>  ut + vt < 2n */
	  if (mpz_odd_p (tt))		/* We can not divide by 2 ... */
	    if (mpz_cmp (tt, n) < 0)	/* n is odd, tt +- n is even! */
	      mpz_add (tt, tt, n);	/* tt < n  =>  tt + n < 2n    */
	    else
	      mpz_sub (tt, tt, n);	/* n <= tt < 2n  =>  0 <= tt - n < n*/
	  mpz_tdiv_q_2exp (vt, tt, 1);	/* tt < 2n  =>  vt < n */
	  mpz_mul_ui (tt, ut, 2*q);	/* 2Q*U_n */
	  mpz_mul_ui (ut, qt, q);	/* Q^(n+1) = (Q^n)*Q */
	  if (sign > 0)
	    mpz_neg (tt, tt);
	  else
	    mpz_neg (ut, ut);
	  mpz_mod (qt, ut, n);
	  /* V_{n+1} = U_{n+1} - 2Q*U_n */
	  mpz_add (tt, tt, vt);
	  mpz_mod (ut, tt, n);
	  mpz_swap (ut, vt);	/* We computed U and V in the wrong places... */
	}
      } while (i != s);

      /* Check if n|U_t or n|V_t*/
      if (mpz_sgn (ut) == 0 || mpz_sgn (vt) == 0)
	ret = ! (s = 0);
      mpz_clear (ut);
    } else { /* Otherwise i == s i.e. n = 2**s - 1. Mersenne number. */
      mpz_t exponent;
      mp_limb_t limbs[LIMBS_PER_ULONG];

      MPZ_FAKE_UI (exponent, limbs, s);
      /* FIXME: Is a probably prime exponent enough for the correctness of Lucas-Lehmer? */
      if (UNLIKELY (mpz_probab_prime_p (exponent, 0))) /* A composite exponent implies composite. */
	if (s < 22) {
	  ret = s != 11;
	} else { /* Lucas-Lehmer */
	  mpz_set_ui (vt, 37634); /* vt = ((4^2-2)^2-2)^2-2 */
	  s -= 2 + 3;
	  do {
	    mpz_mul (tt, vt, vt);
	    mpz_sub_ui (tt, tt, 2); /* tt = (vt*vt) - 2 */
	    mpz_mod (vt, tt, n);    /* IMPROVE: exploit the special form  2**s - 1 of the modulus. */
	  } while (--s);
	  ret = (mpz_sgn (vt) == 0);
	}
      s = 0;     /* No further analysis is needed, for Mersenne numbers. */
      ret <<= 1; /* The answer from Lucas-Lehmer primality test is definitive. */
    }

    while (s) { /* Compute V_t*2^i... */
      mpz_mul (tt, vt, vt);	/* V_n^2 - 2 Q^n*/
      mpz_submul_ui (tt, qt, 2);
      mpz_mod (vt, tt, n);
      if (mpz_sgn (vt) == 0) {
	ret = 1;
	break;
      }
      s--;
      if (s) {
	mpz_mul (tt, qt, qt);
	mpz_mod (qt, tt, n);
      }
    }
    mpz_clear (qt);
    mpz_clear (vt);
    mpz_clear (tt);
  }

  return ret;
}

#ifdef MAIN

/*********************************************************/
/* Section Miller-Rabin: Fixed, base-2 test, for BPSW.   */
/* This code was taken from GMP sources, adapted here    */
/* for speed testing purposes.                           */
/*********************************************************/

static int millerrabin (mpz_srcptr, mpz_srcptr,
			mpz_ptr, mpz_ptr,
			mpz_srcptr, unsigned long int);

int
mpz_millerrabin2 (mpz_srcptr n)
{
  int r;
  mpz_t nm1, x, y, q;
  unsigned long int k;
  gmp_randstate_t rstate;
  int is_prime;
  TMP_DECL;
  TMP_MARK;

  MPZ_TMP_INIT (nm1, SIZ (n) + 1);
  mpz_sub_ui (nm1, n, 1L);

  MPZ_TMP_INIT (x, SIZ (n) + 1);
  MPZ_TMP_INIT (y, 2 * SIZ (n)); /* mpz_powm_ui needs excessive memory!!! */

  MPZ_TMP_INIT (q, SIZ (n));

  /* Find q and k, where q is odd and n = 1 + 2**k * q.  */
  k = mpz_scan1 (nm1, 0L);
  mpz_tdiv_q_2exp (q, nm1, k);

  /* 2 to n-2 inclusive, don't want 1, 0 or -1 */
  mpz_set_ui (x, 2L);

  is_prime = millerrabin (n, nm1, x, y, q, k);

  TMP_FREE;
  return is_prime;
}

static int
millerrabin (mpz_srcptr n, mpz_srcptr nm1, mpz_ptr x, mpz_ptr y,
	     mpz_srcptr q, unsigned long int k)
{
  unsigned long int i;

  mpz_powm (y, x, q, n);

  if (mpz_cmp_ui (y, 1L) == 0 || mpz_cmp (y, nm1) == 0)
    return 1;

  for (i = 1; i < k; i++)
    {
      mpz_powm_ui (y, y, 2L, n);
      if (mpz_cmp (y, nm1) == 0)
	return 1;
      if (mpz_cmp_ui (y, 1L) == 0)
	return 0;
    }
  return 0;
}

/*********************************************************/
/* Section pprime, with fixed Miller-Rabin base-2 test.  */
/* This code was taken from GMP sources, adapted here    */
/* for speed testing purposes.                           */
/*********************************************************/

static int isprime (unsigned long int);

/* MPN_MOD_OR_MODEXACT_1_ODD can be used instead of mpn_mod_1 for the trial
   division.  It gives a result which is not the actual remainder r but a
   value congruent to r*2^n mod d.  Since all the primes being tested are
   odd, r*2^n mod p will be 0 if and only if r mod p is 0.  */

int
mpz_probab_prime (mpz_srcptr n)
{
  mp_limb_t r;
  mpz_t n2;

  /* Handle small and negative n.  */
  if (mpz_cmp_ui (n, 1000000L) <= 0)
    {
      int is_prime;
      if (mpz_cmpabs_ui (n, 1000000L) <= 0)
	{
	  is_prime = isprime (mpz_get_ui (n));
	  return is_prime ? 2 : 0;
	}
      /* Negative number.  Negate and fall out.  */
      PTR(n2) = PTR(n);
      SIZ(n2) = -SIZ(n);
      n = n2;
    }

  /* If n is now even, it is not a prime.  */
  if ((mpz_get_ui (n) & 1) == 0)
    return 0;

#if defined (PP)
  /* Check if n has small factors.  */
#if defined (PP_INVERTED)
  r = MPN_MOD_OR_PREINV_MOD_1 (PTR(n), (mp_size_t) SIZ(n), (mp_limb_t) PP,
			       (mp_limb_t) PP_INVERTED);
#else
  r = mpn_mod_1 (PTR(n), (mp_size_t) SIZ(n), (mp_limb_t) PP);
#endif
  if (r % 3 == 0
#if GMP_LIMB_BITS >= 4
      || r % 5 == 0
#endif
#if GMP_LIMB_BITS >= 8
      || r % 7 == 0
#endif
#if GMP_LIMB_BITS >= 16
      || r % 11 == 0 || r % 13 == 0
#endif
#if GMP_LIMB_BITS >= 32
      || r % 17 == 0 || r % 19 == 0 || r % 23 == 0 || r % 29 == 0
#endif
#if GMP_LIMB_BITS >= 64
      || r % 31 == 0 || r % 37 == 0 || r % 41 == 0 || r % 43 == 0
      || r % 47 == 0 || r % 53 == 0
#endif
      )
    {
      return 0;
    }
#endif /* PP */

  /* Do more dividing.  We collect small primes, using umul_ppmm, until we
     overflow a single limb.  We divide our number by the small primes product,
     and look for factors in the remainder.  */
  {
    unsigned long int ln2;
    unsigned long int q;
    mp_limb_t p1, p0, p;
    unsigned int primes[15];
    int nprimes;

    nprimes = 0;
    p = 1;
    ln2 = mpz_sizeinbase (n, 2);	/* FIXME: tune this limit */
    for (q = PP_FIRST_OMITTED; q < ln2; q += 2)
      {
	if (isprime (q))
	  {
	    umul_ppmm (p1, p0, p, q);
	    if (p1 != 0)
	      {
		r = MPN_MOD_OR_MODEXACT_1_ODD (PTR(n), (mp_size_t) SIZ(n), p);
		while (--nprimes >= 0)
		  if (r % primes[nprimes] == 0)
		    {
		      ASSERT_ALWAYS (mpn_mod_1 (PTR(n), (mp_size_t) SIZ(n), (mp_limb_t) primes[nprimes]) == 0);
		      return 0;
		    }
		p = q;
		nprimes = 0;
	      }
	    else
	      {
		p = p0;
	      }
	    primes[nprimes++] = q;
	  }
      }
  }

  /* Perform a number of Miller-Rabin tests.  */
  return mpz_millerrabin2 (n);
}

static int
isprime (unsigned long int t)
{
  unsigned long int q, r, d;

  if (t < 3 || (t & 1) == 0)
    return t == 2;

  for (d = 3, r = 1; r != 0; d += 2)
    {
      q = t / d;
      r = t - q * d;
      if (q < d)
	return 1;
    }
  return 0;
}

/*********************************************************/
/* Section main: testing/timing helpers, not for gmplib! */
/*********************************************************/

typedef unsigned long long cycles_t;

#define HUGE_TIME 9999999999999ull
#define ITERATIONS 4

#define CYCLES(res, code)			\
do {						\
  unsigned long long __t;			\
  unsigned __p0[2], __p1[2];			\
  int __i;					\
  (res) = HUGE_TIME;				\
  for (__i = ITERATIONS + 1; --__i; ) {		\
    speed_cyclecounter (__p0);			\
    code;					\
    speed_cyclecounter (__p1);			\
    __t = (0x100000000ull * __p1[1] + __p1[0])-	\
	  (0x100000000ull * __p0[1] + __p0[0]);	\
    (res) = MIN (res, __t);			\
  };						\
} while (0)

void
time_primetest_facp1 (unsigned long l, unsigned long s)
{
  mpz_t num;
  unsigned long i;
  cycles_t cycles, cycles2;
  char c;

  mpz_init (num);

  for (i = 0; i <= l; i += s)
    {
      mpz_fac_ui (num, i);
      mpz_add_ui(num, num, 1);
      c = mpz_probab_prime (num) ? '#' : ' ';
      CYCLES (cycles, mpz_probab_prime (num));
      printf("%5lu !+1 %11llu%c",i,cycles,c);
      c = mpz_millerrabin2 (num) ? '#' : ' ';
      CYCLES (cycles, mpz_millerrabin2 (num));
      printf(" %11llu%c",cycles,c);
      c = mpz_stronglucas_selfridgelehmer_p (num) ? '#' : ' ';
      CYCLES (cycles2, mpz_stronglucas_selfridgelehmer_p (num));
      printf(" %11llu%c",cycles2,c);
      c = mpz_probab_prime_p (num, 13) ? '#' : ' ';
      printf(" %f%c\n",(float) cycles2 / cycles, c);
    }
}

void
time_primetest_mersenne (unsigned long l, unsigned long s)
{
  mpz_t num;
  unsigned long i;
  cycles_t cycles, cycles2;
  char c;

  mpz_init (num);

  for (i = 1; i <= l; i += s)
    {
      mpz_ui_pow_ui (num, 2, i);
      mpz_sub_ui(num, num, 1);
      c = mpz_probab_prime (num) ? '#' : ' ';
      CYCLES (cycles, mpz_probab_prime (num));
      printf("2^%4lu -1 %11llu%c",i,cycles,c);
      c = mpz_millerrabin2 (num) ? '#' : ' ';
      CYCLES (cycles, mpz_millerrabin2 (num));
      printf(" %11llu%c",cycles,c);
      c = mpz_stronglucas_selfridgelehmer_p (num) ? '#' : ' ';
      CYCLES (cycles2, mpz_stronglucas_selfridgelehmer_p (num));
      printf(" %11llu%c",cycles2,c);
      c = mpz_probab_prime_p (num, 13) ? '#' : ' ';
      printf(" %f%c\n",(float) cycles2 / cycles, c);
    }
}

int
main (int argc, char ** argv)
{
  unsigned long l, s;
  int t, p;

  printf("          Times computed in cycles. # means 'possibly prime'.\n");
  printf("number    probab_prime Miller-Rabin Lucas-Se-Le  Ratio(Lucas/Miller)\n");

  time_primetest_mersenne (1300, 2);

  l =430;
  s = 7;

  time_primetest_facp1 (l, s);

  return 0;
}

#endif
_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
http://gmplib.org/mailman/listinfo/gmp-devel

Reply via email to