Re: Mersenne: Simple question about P-1 factoring method

1999-08-17 Thread Peter-Lawrence . Montgomery

Will Edgingtom writes:

 If I understand P-1 factoring correctly, then using it to a stage one
 bound of k to try to factor M(p) will find all possible factors less
 than or equal to 2*k*p + 1.  I'm assuming that p is less than k (or p
 is always used in the powering) and the convention several of us
 agreed on a while back that all prime powers less than the stage one
 bound are used in the powering, not just the primes themselves.  That
 is, trying to factor M(97), say, to a stage one bound of 10 would use
 8, 9, 5, 7, and 97, not just 2, 3, 5, and 7.

Yes, p itself should be used in the powering.  
So should all prime powers below (or up to and including) k.

 Am I correct?  Or could a factor smaller than 2*k*p + 1 be missed in
 some cases?

In the last example a factor 16*97 + 1 could be missed.
Otherwise all factors below 2*k*p + 1 should be found.  
One extra squaring will achieve the 2*k*p + 1 bound.

   The power of the P-1 algorithm is that it can potentially find 
many larger factors, such as 252*p +1 using a stage 1 bound of 10.  
Observe 252 = 4 * 9 * 7 is a product of prime powers each = 10.
If you want to check only for prime factors below 2*k*p + 1,
P-1 is not the way.  [P-1 coupled with ECM might be the way if k is
large, although some uncertainty remains even after repeated ECM runs.]

 Does it matter whether p is prime or not?  I don't think so, but ...

Not if you always include an exponentiation by p, and repeat it
if necessary as you do primes = k.

Peter Montgomery


_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Simple question about P-1 factoring method

1999-08-17 Thread Will Edgington


[EMAIL PROTECTED] writes:

Am I correct?  Or could a factor smaller than 2*k*p + 1 be missed in
some cases?

   In the last example a factor 16*97 + 1 could be missed.
   Otherwise all factors below 2*k*p + 1 should be found.  
   One extra squaring will achieve the 2*k*p + 1 bound.

Gack.  Yes, I should have caught that myself; it's the same situation
as for p, isn't it?

  The power of the P-1 algorithm is that it can potentially find 
   many larger factors, such as 252*p +1 using a stage 1 bound of 10.  

Of course; I realize that.  I was only looking at it this odd way
because of the trial factoring gaps I need to close.  Since I already
have the P-1 data, it's easy to do this.  If I didn't already have the
P-1 data, it would (most likely) be faster to simply do the trial
factoring.

Further, it seems to me that doing trial factoring to extend from P-1
factoring doesn't make sense.  Note that trial factoring would have to
check 2*(k + 1)*p + 1 next; P-1 only has to do k + 1 next if it's a
prime or prime power (or p).  Trial factoring could use the knowledge
of P-1 being done thru a stage one of k by "sieving" the trial factors
based on one less than the trial factors as well as the usual sieving
of the trial factors themselves, but that's exactly the set that P-1
would test with larger stage one bounds, and P-1 would, as you point
out, find more factors with at most a little extra work.  Right?

I've heard that P-1 is "more efficient" than trial factoring; does the
proof go along these lines?  Or is it more complicated than this?

Of course, if this is correct, then I should fill the trial factoring
gaps using P-1, at least to the largest stage one the program that I
use supports.

Does it matter whether p is prime or not?  I don't think so, but ...

   Not if you always include an exponentiation by p, and repeat it
   if necessary as you do primes = k.

So a composite p should effectively be treated as if it were prime in
the powering even though it's prime factors are being used as well?
That certainly makes sense, given the extra power of 2 and of p used
because of the special form of Mersenne factors.

Thanks,

Will
_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Simple question about P-1 factoring method

1999-08-17 Thread Lucas Wiman

 If I understand P-1 factoring correctly, then using it to a stage one
 bound of k to try to factor M(p) will find all possible factors less
 than or equal to 2*k*p + 1.

Yes, of the form n*p+1 (not 2*n*p+1 :).  This is for the simple reason
that every power of a prime =k must divide Q (due to your definition
of how Q is produced).  Then by the fundemental theorem of arithmetic,
(all numbers are able to be evenly divided into primes  themselves),
we know that any number k must be a product of powers of primes k,
and hence divide Q.

This is (unfortunatly) not useful for you, if your goal is to
deterministically find factors less than a certain limit.  P-1 would
be much slower than trial division if that is your goal.  P-1 is
useful for finding very large factors that would be missed by trial
division.

Oop, just got your other letter:

 I've heard that P-1 is "more efficient" than trial factoring; does the
 proof go along these lines?  Or is it more complicated than this?

It's only "sort of" more efficient than trial factoring.  It will find
(some) large factors more quickly than trial factoring will, but it
wouldn't find the factor 2^40*p+1, which (unless p is very small) would
be found very easily by trial factoring.

 Of course; I realize that.  I was only looking at it this odd way
 because of the trial factoring gaps I need to close.  Since I already
 have the P-1 data, it's easy to do this.  If I didn't already have the
 P-1 data, it would (most likely) be faster to simply do the trial
 factoring.

Why would you have trial factoring with such low bounds, but P-1 with
such high bounds?  Just asking...

-Lucas

_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: RE: Factoring more

1999-08-17 Thread Lucas Wiman

 Numbers above   are factored to
 -   ---
 71002^72
 57022^71
 44152^70
 35102^69
 28132^68
 21592^67
 17852^66
 13382^65
 825 2^64

Isn't this the "optimal" configuration if all computers in GIMPS
were identical?

There are a number of 486's that are doing only factoring, does
it take these into account?  Think about it, there are a number of
computers which should be factoring numbers faster than LL
tests can be performed.  Call this "factoring profit."  Wouldn't
it make sense to keep factoring profit as low as possible, as this
could speed up the more immediate consern of sooner-to-be-performed
LL tests?

As an example, I just recieved the factoring assignment of 10258511,
but I should think that we wouldn't even start these tests for some time.
Why not go back and factor some 8M exponents further so as to save a
few current LL tests?  Wouldn't this actually get us to the next prime
quicker?

Or do I need to get more sleep :)?

-Lucas
_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Mersenne: Alpha DS20 timings.

1999-08-17 Thread Simon Burge

Folks,

Compaq have a DS20 Alpha with 2 500MHz 21264 CPUs on the internet for
people to try out.  I've modified the mers package to that it can print
out iteration timing (patches coming soon Will!).  The iteration times
are fairly constant across 10 samples, so I've only listed one per
program/exponent.  The -C means don't dump a checkpoint at any time, and
-S N means print iteration times every N iterations.  The programs were
compiled with the DEC C compiler using "cc -fast -arch host -O4".  The
exponents were choosen just to demonstrate different FFT lengths.

Before anyone says anything, I know that the "iters/sec" should be
"secs/iter" :-)

% ./fftlucas -C -S 10 91
speed: 10 iters in  1.362 seconds, 0.136 iters/sec (fft len   64k)
% ./fftlucas -C -S 10 141
speed: 10 iters in  3.168 seconds, 0.317 iters/sec (fft len  128k)  
% ./fftlucas -C -S 10 291
speed: 10 iters in  7.124 seconds, 0.712 iters/sec (fft len  256k) 
% ./fftlucas -C -S 10 581
speed: 10 iters in 16.410 seconds, 1.641 iters/sec (fft len  512k)

% ./mersenne1 -C -S 10 91
speed: 10 iters in  0.832 seconds, 0.083 iters/sec (fft len   64k)
% ./mersenne1 -C -S 10 141
speed: 10 iters in  1.797 seconds, 0.180 iters/sec (fft len  128k)
% ./mersenne1 -C -S 10 291
speed: 10 iters in  4.257 seconds, 0.426 iters/sec (fft len  256k)
% ./mersenne1 -C -S 10 581
speed: 10 iters in  9.055 seconds, 0.905 iters/sec (fft len  512k)

% ./MacLucasUNIX -C -S 10 141
speed: 10 iters in  0.152 seconds, 0.015 iters/sec (fft len   64k)
% ./MacLucasUNIX -C -S 10 291
speed: 10 iters in  0.362 seconds, 0.036 iters/sec (fft len  128k)
% ./MacLucasUNIX -C -S 10 581
speed: 10 iters in  0.782 seconds, 0.078 iters/sec (fft len  256k)
% ./MacLucasUNIX -C -S 10 1161
speed: 10 iters in  1.777 seconds, 0.178 iters/sec (fft len  512k)
% ./MacLucasUNIX -C -S 10 2321
speed: 10 iters in  4.606 seconds, 0.461 iters/sec (fft len 1024k)
% ./MacLucasUNIX -C -S 10 4641
speed: 10 iters in 13.601 seconds, 1.360 iters/sec (fft len 2048k)

and for the 10^n digit fans:

% ./MacLucasUNIX -C -S 10 33219281
speed: 10 iters in  4.634 seconds, 0.463 iters/sec (fft len  1024k)
% ./MacLucasUNIX -C -S 3 332192831
speed: 3 iters in 65.950 seconds, 21.983 iters/sec (fft len 16384k)

The machine "only" had 1GB of RAM, and the 16M FFT took up about 675MB
of RAM.  I couldn't test any larger numbers :-)


For comparison, this is MacLucasUNIX on a 500MHz AlphaPC164 (21164 CPU)
compiled with "gcc -mcpu=21164a -Wa,-m21164a -O6":

speed: 10 iters in  0.331 seconds, 0.033 iters/sec (fft len   64k)
speed: 10 iters in  0.842 seconds, 0.084 iters/sec (fft len  128k)
speed: 10 iters in  1.918 seconds, 0.192 iters/sec (fft len  256k)
speed: 10 iters in  4.219 seconds, 0.422 iters/sec (fft len  512k)
speed: 10 iters in 10.531 seconds, 1.053 iters/sec (fft len 1024k)

GCC isn't the best compiler around with floating point, so these figures
might not be the best comparison between the 21164 and 21264.

Also for comparison, here's some figures for MacLucasUNIX on a 200MHz
UltraSparc with different FFT lengths:

speed: 10 iters in  0.530 seconds, 0.053 iters/sec (fft len   64k)
speed: 10 iters in  1.739 seconds, 0.174 iters/sec (fft len  128k)
speed: 10 iters in  3.737 seconds, 0.374 iters/sec (fft len  256k)
speed: 10 iters in  7.459 seconds, 0.746 iters/sec (fft len  512k)
speed: 10 iters in 16.261 seconds, 1.626 iters/sec (fft len 1024k)

Even dividing the iteration times by 2.5 (which assumes that memory
bandwidth scales equally well with the UltraSparcs), the Alpha 21264
comes up favorably.


Ernst - since nigel is no more, where can I get the latest f90 code?
I've got 2.5b, and it's giving me some errors:

no restart file found...looking for range file...
no range file found...switching to interactive mode.
   Enter p,n (set n=0 for default FFT length) 5100071,262144
   Enter 'y' to run a self-test, return for a full LL test y
p is prime...proceeding with Lucas-Lehmer test...
using an FFT length of  262144
this gives an average19.4552268981934  bits per digit
   M 5100071 Roundoff warning on iteration  10 maxerr =  0.47074861
FATAL ERROR...Halting execution.

Testing 3100079 (at about 11 bits per digit) gives the same errors, and
this happens with or without optimisation turned on.


Well, that's my benchmarking done for the day...

Simon.
_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



RE: Mersenne: Alpha DS20 timings.

1999-08-17 Thread Aaron Blosser

 Compaq have a DS20 Alpha with 2 500MHz 21264 CPUs on the internet for
 people to try out.
snip
 and for the 10^n digit fans:

 % ./MacLucasUNIX -C -S 10 33219281
 speed: 10 iters in  4.634 seconds, 0.463 iters/sec (fft len  1024k)
 % ./MacLucasUNIX -C -S 3 332192831
 speed: 3 iters in 65.950 seconds, 21.983 iters/sec (fft len 16384k)

 The machine "only" had 1GB of RAM, and the 16M FFT took up about 675MB
 of RAM.  I couldn't test any larger numbers :-)

Just like I thought...the 21264 really is a monster.

Compaq had a few of their REALLY nice boxes at the engineer conference last
week in Toronto.  I desperately wanted to run some benchmarks, but something
told me that I shouldn't run software on these machines without asking
permission... hmmm..

Can't wait to get me one of them boxes!

Aaron

ps - Is anyone else besides me happy that Compaq is ditching that boring old
"computer beige" in favor of the "opal" (basically white) color for their
servers?  I think they look snazzy.

_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Alpha DS20 timings.

1999-08-17 Thread Simon Burge

George Woltman wrote:

 Hi,
 
 At 01:37 PM 8/18/99 +1000, Simon Burge wrote:
 and for the 10^n digit fans:
 % ./MacLucasUNIX -C -S 10 33219281
 speed: 10 iters in  4.634 seconds, 0.463 iters/sec (fft len  1024k)
 
 You can't test M33219281 with a 1024k FFT, you'll need a 2048k fft :(

I guess MacLucasUNIX didn't pick up a error with a 1M FFT in the first
200 iterations.  Oh well, at 1.361 secs/iter with a 2M FFT, it'll take
roughly 523 days...

Do we have any Intel timings on a FFT this size?

Simon.
_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Alpha DS20 timings.

1999-08-17 Thread George Woltman

Hi,

At 01:37 PM 8/18/99 +1000, Simon Burge wrote:
and for the 10^n digit fans:
% ./MacLucasUNIX -C -S 10 33219281
speed: 10 iters in  4.634 seconds, 0.463 iters/sec (fft len  1024k)

You can't test M33219281 with a 1024k FFT, you'll need a 2048k fft :(

George

_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers