Mersenne Digest Saturday, May 8 1999 Volume 01 : Number 553
----------------------------------------------------------------------
Date: Wed, 05 May 1999 23:47:19 -0500
From: Ken Kriesel <[EMAIL PROTECTED]>
Subject: Mersenne: volunteers & nominees for the QA effort
Here is the list to date of volunteers. If I missed anyone, or understated
or overstated your areas of involvement, please reply offline to me,
and I'll update the other participants in a single summary message.
George Woltman <[EMAIL PROTECTED]> is in it by default, as
a code developer and tester (presumably on Win95 and NT)
Any additions George?
George nominated (drafted?) the next two:
Richard McDonald <[EMAIL PROTECTED]>
PPC-based MAC client software (MacGIMPS) developer
"Ernst W. Mayer" <[EMAIL PROTECTED]>
DEC Alpha & SGI client software (lucas_mayer) developer
In somewhat random order are those who volunteered themselves:
Guillermo Ballester Valor <[EMAIL PROTECTED]>
linux win95
pentium166mmx
QAtest
Basic, Fortran, assembler (for I86X), C and C++
"Brian J Beesley" <[EMAIL PROTECTED]>
Run QA tests - on Intel Pentium II under Windows 95 & NT WS 4.0
- on Alpha 21164 under Red Hat Linux 5.1
- maybe on Intel Pentium II under Linux, later
Help design QA tests
Help keep testing databases & provide statistics. Can provide
filestore to assist with this.
Will assist in any coding which may be necessary
I'm less happy about reviewing code, but I'll give it a bash if necessary
Shane Sanford <[EMAIL PROTECTED]>
QAtest & C/C++ code
MS Visual C/C++ 4.0,5.0, & 6.0; P2 450 196 meg ram
access to Win95a/Win95b/Win98/NT 4.0
Ken Kriesel <[EMAIL PROTECTED]>
NT4 and win95 now, linux eventually
assorted systems including 486-66, pentium, dual-pentium, dual-pentiumpro,
pentiumII-400
QAtest
generating full LLtests under prime95 v14.4, v16.3, v18.1, and others.
Nick Craig-Wood <[EMAIL PROTECTED]>
http://www.axis.demon.co.uk/
suggests his ARMprime program as another source of test results from an
independent architecture and program
"Jean-Yves Canart" <[EMAIL PROTECTED]>
QAtest
PentiumII-266 with unspecified OS
So to reach the QAtesters, use this list:
<[EMAIL PROTECTED]>,<[EMAIL PROTECTED]>,<[EMAIL PROTECTED]>,<BJ.Beesley@ulst.ac.uk>,<[EMAIL PROTECTED]>,<[EMAIL PROTECTED]>
And for the code reviewers or developers:
<[EMAIL PROTECTED]>,<[EMAIL PROTECTED]>,<[EMAIL PROTECTED]>,<rmc[EMAIL PROTECTED]>,<[EMAIL PROTECTED]>
Ken
Ken Kriesel
(resident of Madison WI, last year's #1 city and this year's #7)
________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
------------------------------
Date: Thu, 6 May 1999 01:11:25 -0700
From: Paul Leyland <[EMAIL PROTECTED]>
Subject: RE: Mersenne: ECM question
> At Paul Zimmermann's ECM page,
> http://www.loria.fr/~zimmerma/records/ecmnet.html
> the optimal B1 value listed for finding 50-digit factors is
> 43000000, but
> George's ECM factoring page uses 44000000 for the same
> purpose. Is one of
> them wrong, or is there a reason for the difference?
No, neither is "wrong", for at least two reasons.
First, ECM is a probabilistic algorithm. Each run chooses a random elliptic
curve and has a certain chance of finding a factor of a particular size. When
enough curves have been run, there is a particular probability of finding a
factor of that size, assuming that one exists. If one chooses 50% as the
desired probability, the number of curves required will obviously be fewer
than if one chooses 60%, say. A similar choice can be made for trading off
the B1 value against probability, as long as the trade isn't pushed too far.
Another reason is that the B1 value is only one quantity of importance.
Even if the probability mentioned above is fixed, the optimal number of
curves depends on the value of B2. Different implementations of ECM (or
even different runs of the same implementation) are free to choose different
values of B2 for a given B1.
A non-reason, but still of interest, is that the maximum of the probability-
against-B1 curve is really rather flat, and it doesn't matter too much if
parameters are chosen which are not strictly optimal.
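The trade-off between per-curve success probability and the number of curves
can be made concrete. A minimal sketch, assuming a purely hypothetical
per-curve probability (real ECM implementations publish their own tables):

```python
import math

def curves_needed(p_curve, p_target):
    """Independent ECM curves needed so the overall chance of finding
    the factor (assuming one exists) reaches p_target, when each curve
    succeeds with probability p_curve."""
    return math.ceil(math.log(1.0 - p_target) / math.log(1.0 - p_curve))

# Hypothetical figure: one curve at some B1/B2 finds a 50-digit factor
# with probability 1/9000.  A 60% overall chance needs noticeably more
# curves than a 50% chance:
print(curves_needed(1/9000, 0.50))
print(curves_needed(1/9000, 0.60))
```

The same arithmetic shows why two sites can publish different "optimal"
parameters: a different target probability or B2 strategy shifts the whole
table.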
Paul
------------------------------
Date: Thu, 6 May 1999 14:13:35 +0000
From: "Steinar H. Gunderson" <[EMAIL PROTECTED]>
Subject: Mersenne: Re: Mersenne Digest V1 #552 (IA-64)
On Wed, May 05, 1999 at 05:54:50PM -0700, Mersenne Digest wrote:
>From the IA-64 register set figure in the advert, one weakness appears to me
>to be the sheer amount of silicon: Intel is going from just 8 FP registers
>in all the Pentium incarnations to a whopping 128, each still having the
>x86's full 80 bits, in the IA-64. There are also 128 65-bit general purpose
>(64-bit integer plus carry bit) registers.
I've heard the main problem is getting efficient code out of it. Merced
(IA-64) is rumoured (or is it official?) to have 7 different execution
streams. Writing code to use all of these efficiently will be very difficult,
if it's possible at all. However, I think it's great that Intel is finally
providing more registers than the 8 (actually 7, since you need at least a
stack pointer) they've been using. 128 might just be a bit too much, though.
>2) What use are so many registers without a major increase in the number
>of functional units? (I believe IA-64 still has 1 FP adder, 1 FP multiplier,
>perhaps 2 64-bit integer units, the latter having been a longstanding feature
>of e.g. Alpha and MIPS, which both have excellent integer functionality
>and neither has 128 integer registers.)
The point of registers is not only to be able to operate on them
simultaneously, but also to be able to hold things in them. If you've
done x86 assembly, you will find yourself having to push data in and out
of memory all the time, because 8 registers simply aren't enough for most
algorithms. Hopefully, the extra registers will reduce the amount of data
going to and from memory (and cache).
I would love to see SIMD operations on this, though. Imagine splitting the
registers into 16 parts and doing `16-way' SIMD operations...
>Sure, lots of registers is nice for out-of-order execution (OOE), but even
>an OOE monster like the Alpha 21264 has "just" 80 FP registers, each of just
>64 bits - IA-64 will have double the silicon here!
Agreed. But nobody will accept going _down_ in FP precision.
---snip---
>> b1) Using "properly sized" FFT: Yes, I believe using properly sized FFT's
>> saves some time. How much depends on the programmer, the data, the
>> language/compiler, and the hardware. Non-power-of-2 FFT's can even be
>> useful, but it seems that coding up all (or most) of the possible sizes may
>> be more trouble than it's worth.
>Well, George went to the trouble of adding 3*2^n, 5*2^n and 7*2^n
>run length code into Prime95 - even though it's less efficient than
>plain powers-of-two, it's still well worth while, for the exponents that
>can benefit from it.
I think we're mixing up two things here. One is non-power-of-two FFT
run lengths. What I believe he originally asked about was changing the
FFT run length dynamically, as the number increased in size.
>If it _does_ start to repeat "early" then 2^p-1
>_must_ be composite - because it can't get to zero without sticking
>at 2, see above - but is it really worth adding the extra code to
>check?
Probably not, as you would have to keep everything in memory (or on disk).
At (say) a 256K FFT, you would need 1 MB just for four iterations, and an
awful lot of comparisons.
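As an aside, the storage and comparison cost could in principle be cut by
keeping only a small fingerprint of each residue, with a full comparison
needed only on a fingerprint match. A sketch of that idea (not something
Prime95 or any other client actually does), using Python integers in place
of FFT residues:

```python
def detect_repeat(start, modulus, n_iters):
    """Iterate s -> s^2 - 2 (mod modulus), keeping a cheap 64-bit
    fingerprint of each residue; return the pair of iteration indices
    whose fingerprints first collide, or None.  A collision is only a
    *candidate* repeat: the full residues would still need comparing."""
    seen = {}
    s = start
    for i in range(n_iters):
        fp = s & 0xFFFFFFFFFFFFFFFF  # low 64 bits as fingerprint
        if fp in seen:
            return seen[fp], i
        seen[fp] = i
        s = (s * s - 2) % modulus
    return None

# Tiny example: mod 7 the sequence goes 4, 0, 5, 2, 2, ... and sticks at 2.
print(detect_repeat(4, 7, 20))  # prints (3, 4)
```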
/* Steinar */
------------------------------
Date: Tue, 4 May 1999 22:51:16 +0200
From: "Steinar H. Gunderson" <[EMAIL PROTECTED]>
Subject: Mersenne: Re: Mersenne Digest V1 #551
Second reply to the same message...
On Mon, May 03, 1999 at 11:22:02AM -0700, Mersenne Digest wrote:
>Date: Fri, 30 Apr 1999 10:44:16 -0600
>From: "Blosser, Jeremy" <[EMAIL PROTECTED]>
>Subject: Mersenne: JavaPrime update... (Lotsa Questions)
>
>b) Would it be more optimal to use a properly sized FFT to do the
>calculation than to use the 256k or 512k FFT. For example, a 64k FFT is
>subsecond, so is it feasable to initialize several different FFT engines for
>properly sized numbers.
I doubt it. Considering that the number is small for only perhaps the first
100 iterations (or so), and is at its full size for the remaining millions,
this would only add extra complexity. Remember, the number never gets larger
than the Mersenne number itself.
>c) At some point, doing the math via FFT will become more expensive that a
>disk read (Especially if we plan on doing the 1,000,000,000 digit prime). So
>wouldn't a DB type lookup be more efficient at that point?
That would require an enormous amount of space! Consider just a 1K FFT:
2^1024 entries of 1 kilobyte each gives:
179769313486231590772930519078902473361797697894230657273430081157732
675805500963132708477322407536021120113879871393357658789768814416622
492847430639474124377767893424865485276302219601246094119453082952085
005768838150682342462881473913110540827237163350510684586298239947245
938479716304835356329624224137216
different combinations. I don't think we will see hard disks of this size
before the sun goes out, I'm afraid.
Or perhaps I've really missed your point ;-)
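For the record, the count above is just 2^1024, which any arbitrary-precision
integer type reproduces directly:

```python
# Number of distinct 1024-bit FFT inputs: Python's bignums reproduce
# the 309-digit figure quoted above.
n = 2 ** 1024
print(n)
print(len(str(n)))  # 309 digits
```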
>e) Can N=N^2-2 be represented as a series? If so, wouldn't that be faster?
Keep in mind the modulo...
/* Steinar */
------------------------------
Date: Thu, 6 May 1999 19:21:25 +0200 (MET DST)
From: [EMAIL PROTECTED]
Subject: Re: Mersenne: ECM question
"Foghorn Leghorn" <[EMAIL PROTECTED]> observes:
> At Paul Zimmermann's ECM page,
> http://www.loria.fr/~zimmerma/records/ecmnet.html
> the optimal B1 value listed for finding 50-digit factors is 43000000, but
> George's ECM factoring page uses 44000000 for the same purpose. Is one of
> them wrong, or is there a reason for the difference?
If I count the zeros right, these are 43 million and 44 million.
The function being minimized, namely
probability of finding a 50-digit factor on one curve, given that such exists
-----------------------------------------------------------------------------
time per curve
is flat near its minimum. Implementation and platform differences
can obviously affect the denominator (time per curve).
The stage-2 strategy affects the numerator. The two optimal B1's
are close enough to be considered the same.
------------------------------
Date: Thu, 6 May 1999 18:10:14 -0400
From: "Ernst W. Mayer" <[EMAIL PROTECTED]>
Subject: Mersenne: Mlucas for DEC Alpha / Linux
Helmut Zeisel ([EMAIL PROTECTED]) writes:
>Can you please give me some information about
>your Mersenne code for a DEC Alpha running Linux?
The source is at ftp://nigel.mae.cwru.edu/pub/mayer/Mlucas_2.6.f90.gz;
the README file in the same directory tells how to use it.
>As I understand, it is Fortran 90.
>Do I need a Fortran 90 compiler
>to use it on a DEC Alpha chip under Linux
>or do you also have a binary distribution?
Thanks to my SPEC98 contacts at DEC, er, Compaq, as of today I have a
Linux executable, compiled using a pre-beta version of their soon-to-be-
released F90 compiler for Alpha/Linux. This is in
ftp://nigel.mae.cwru.edu/pub/mayer/bin/ALPHA_LINUX/Mlucas_2_6X.gem4494u_linux.exe.gz;
once you have it, use gunzip or gzip -d to uncompress it. Note that it's
a pretty big file (618KB gzipped, 2MB uncompressed) because it consists
of the basic executable bundled together with all the run-time library (RTL)
files it needs to run. Please make sure to run the self-tests described in
the README file.
At this point, I must insert a word from my sponsors:
* * * * * *
Compaq Computer Corporation has said that it will release a Fortran
compiler for Alpha Linux during 1999. (See
http://www.compaq.com/newsroom/pr/1999/pr060499a.html for the press
release.) MLucas V2.6 has been compiled with a pre-beta version of this
compiler. The executable is available for users of Alpha Linux systems,
with the following conditions:
1) This text must be included with all distributions of the
executable and prominently displayed at download locations.
2) Please understand that the compiler is pre-beta. While we do not
know of any bugs that would affect the correct execution of
MLucas, surprises are certainly possible.
3) If you do encounter any surprises, please send a complete
description of the circumstances to the author of MLucas,
Ernst Mayer, [EMAIL PROTECTED] , including information
about your Linux environment (e.g. which software distribution)
and information about your hardware (exact model number as
printed on the back of the case would be very helpful).
4) Usage must be summarized to the author. Whether you have any
surprises or not, please drop a line to [EMAIL PROTECTED]
to let us know how things work out.
* * * * * *
I have only Unix on my Alphas, so have been unable to do timings myself,
but the folks at DEC, er, Compaq assure me it's about the same speed as
the Unix version compiled with the native compiler.
>Do you have any numbers how fast it runs
>compared to George Woltman's MPrime on
>an Intel chip?
Under DEC Unix, it's 1.5-2 times slower than George's Prime95
at the same clock speed. On SGI (for which executables are also available),
it's about equal at the same clock speed.
>On what ranges of exponents does it work?
The same as Prime95, i.e. up to 20 million or so. It also uses similar
FFT lengths (i.e. of form 2^n, 3*2^n, 5*2^n, 7*2^n) for the various exponent
ranges, although Prime95 can go about 1% higher for a given FFT length due
to the extra accuracy of the Intel 80-bit register type.
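Choosing among those run-length families is a simple search. A sketch of the
idea; the ~20 bits-per-word cap and the length range here are illustrative
assumptions, not the actual thresholds used by Mlucas or Prime95:

```python
def smallest_fft_length(exponent, max_bits_per_word=20.0):
    """Smallest length of the form k*2^n (k in 1, 3, 5, 7) such that
    the average number of bits stored per FFT word stays under the cap."""
    candidates = sorted(k * 2 ** n for k in (1, 3, 5, 7)
                        for n in range(10, 25))
    for length in candidates:
        if exponent / length <= max_bits_per_word:
            return length
    return None

print(smallest_fft_length(5_000_000))  # -> 262144, i.e. a 256K FFT
```

The extra accuracy of the 80-bit x87 type shows up here as a slightly larger
permissible bits-per-word cap, which is why Prime95 can push a given FFT
length to somewhat higher exponents.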
Happy hunting,
Ernst
------------------------------
Date: Fri, 7 May 1999 08:15:46 -0700
From: Paul Leyland <[EMAIL PROTECTED]>
Subject: RE: Mersenne: ECM question
> The function being minimized, namely
>
> probability of finding a 50-digit factor on one curve
> -----------------------------------------------------
> time per curve
>
> is flat near its minimum. Implementation and platform differences
> can obviously affect the denominator (time per curve).
> The stage-2 strategy affects the numerator. The two optimal B1's
> are close enough to be considered the same.
Umm. I think you want to maximize the probability.
Minimizing it is easy.
Paul
------------------------------
Date: Fri, 07 May 1999 20:54:01 -0400
From: George Woltman <[EMAIL PROTECTED]>
Subject: Mersenne: Any statistics majors out there?
Hi all,
I'm working on version 19 of prime95 and I need your help.
In the past, the exponents at which a larger FFT is used were picked
rather haphazardly. I simply picked a few exponents, trying to find
one that could run a thousand or so iterations without the convolution
error greatly exceeding 0.3.
For this release, I thought it would be nice to use a more
scientific approach. What I've done is run almost 10000 iterations
of an exponent and used Excel to graph the number of iterations with
an error between 0.200 and 0.201, 0.201 and 0.202, etc. As you can imagine,
you get a bell-like curve that peaks at some value.
Since it has been 25 years since I took statistics courses
in college, I am at a loss as to how to use this graph to predict the
probability that N iterations will have an error below M. I could
then use such information to find the exponent such that the chance
of a convolution error exceeding 0.5 is less than 1 in 10^40 (or some
other criterion that we agree is "safe").
Rather than mail these largish spreadsheets to the entire list,
two sample graphs are at ftp://entropia.com/gimps/x1345000.xls and
ftp://entropia.com/gimps/x1360000.xls (in the graphs a 400 on the X-axis
is really a convolution error of 0.400, the Y-axis is a count of iterations).
Any help is appreciated.
Best regards,
George
P.S. A brief explanation of convolution errors: Since the FFT in prime95
is done in floating point arithmetic, round-off errors build up. After the
FFT, every value is rounded back to an integer. The convolution error is
the largest amount that a result value differs from the correct integer
value. If the convolution error exceeds 0.5, then the round-back-to-integer
step will round to the wrong integer value - a catastrophic failure.
Another way to look at this: Each FFT value contains about 20 bits of data.
That is, values ranging from -2^19 to 2^19. Consider an FFT of 128K elements.
If each FFT value was close to the 2^19 value, then each result value would
need 19+19+17=55 bits to hold the result. Since a float only holds 53 bits
there would be a loss of information. Fortunately, probability assures us
that it is extremely unlikely that every (or even most) FFT values will
be close to the 2^19 maximum value.
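Under the Gaussian assumption (which, as the replies discuss, is the crux),
the prediction can be sketched with the complementary error function. The
mean and sigma below are made-up placeholders, not values from the
spreadsheets:

```python
import math

def p_any_exceed(mean, sigma, limit, n_iters):
    """Assuming per-iteration convolution errors are Gaussian, estimate
    the probability that at least one of n_iters iterations exceeds
    `limit` (union bound: n_iters times the one-sided tail)."""
    z = (limit - mean) / sigma
    p_one = 0.5 * math.erfc(z / math.sqrt(2.0))  # one-sided Gaussian tail
    return min(1.0, n_iters * p_one)

# Placeholder numbers: mean error 0.25, spread 0.017, and a
# 10-million-iteration LL test:
print(p_any_exceed(0.25, 0.017, 0.5, 10_000_000))
```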
------------------------------
Date: Fri, 7 May 1999 20:24:21 -0700
From: Todd Sauke <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Any statistics majors out there?
George,
You indicate that the error distribution looks "like a Bell curve".
There is a reasonable theoretical basis for the errors to follow a Bell
curve: the sum of many random pluses and minuses combines, as in the
classic "random walk" problem, to give a Gaussian probability distribution.
The probabilities for various outcomes of Gaussian-distributed events are
given by the "error function", which is defined essentially as the integral
of a Gaussian probability distribution from its center to some desired
point (like 0.5), expressed in terms of the width of the distribution
(you've got to worry about the pesky square-roots-of-2). Probability outside
one sigma ~ 0.3; outside 2 sigma < .05; outside 3 sigma < 0.003; outside 4
sigma ~ 0.00006; outside 5 sigma < 6e-7; outside 6 sigma < 2e-9; outside 7
sigma < 3e-12; . . . outside 14 sigma < 2e-44. This function is tabulated
in reference books, but usually not out to within 10^-40 of the terminal
value. I suppose there are good approximations applicable out there,
though. So outside about 14 sigmas you should be able to say the
probability is below 10^-40. The problem is that if there are small
deviations from "Gaussian-ness" way out on the wings of your distribution,
the REAL probability of a certain result is not well approximated by the
error function result. Even though there are reasonable theoretical
considerations to indicate that your error distribution "should" be
strictly Gaussian, any "weird", low-probability effects that cause
small deviations from "Gaussian-ness" will make the "computed" probability
of, say, 10^-40 be wrong and give a false sense of security. This is why
people usually do what you instinctively did before and go with a "gut
feeling" about how many "sigmas out on the distribution" to go for safety.
Without a firm idea of what the "real" distribution is or unassailable
theoretical conviction that the errors "must" be Gaussian, a probability
calculation out to 10^-40 just gives false security. That said, you really
don't need to go out to 10^-40 anyway. The probability of having a
roundoff error greater than 0.5 over an entire LL test only needs to be
small compared to other typical error risks, like gamma rays, people using
excessive overclocking, or whatever effects cause the ~1% errors revealed
by double-checking. You probably want the per-iteration probability of an
error greater than 0.5 to be much less than about 10^-10, say. So if 0.5 is outside about 6
or 7 sigmas of your error distribution, you would be OK, ASSUMING the
distribution is nearly Gaussian. You might look at the integral of the
error histograms you get to see if they deviate subtly from the error
function. I was unable to load the raw data you referenced; I'm using the
OLD Excel 4.0. If you want to send me some raw data readable by Excel 4.0
I would be happy to look at it and look for how faithfully it follows a
Gaussian.
Sincerely,
Todd Sauke
------------------------------
Date: Sat, 08 May 1999 10:53:37 -0400
From: Chris Nash <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Any statistics majors out there?
Hi George
> For this release, I thought it would be nice to use a more
> scientific approach. What I've done is run almost 10000 iterations
> of an exponent and used Excel to graph the number of iterations with
> an error between 0.200 and 0.201, 0.201 and 0.202, etc. As you can imagine
> you get a Bell-like curve that peaks at some value.
>
> Since it has been 25 years since I took statistics courses
> in college, I am at a loss as to how to use this graph to predict the
> probability that N iterations will have an error below M. I could
> then use such information to find the exponent such that the chance
> of a convolution error exceeding 0.5 is less than 1 in 10^40 (or some
> other criteria that we agree is "safe").
It's been a while since I took statistics too, but it is worth a try. What
you would need to do in theory is calculate the mean and variance of your
sample, and use those to estimate the mean and variance of the entire
distribution. The mean is easy: the sample mean is your estimator. The
variance is harder: the unbiased estimator is n/(n-1) times the sample
variance. From that, we take the square root and get an estimate of the
standard deviation. Finally we need to verify that 0.5 is "several" standard
deviations from the mean before we can state with any degree of confidence
that 0.5 won't occur.
How many is "several"? To calculate this, assume a Gaussian distribution.
The density function of a distribution with mean zero and standard
deviation 1 is f(x) = (1/sqrt(2*pi)) * exp(-x^2/2). Its integral (the error
function) has no closed form, but the trick is to multiply two error
functions in x and y, and integrate using polar coordinates. If we integrate
within a circle of radius R, then the integral is larger than the
probability that both x and y are less than R/sqrt(2) in absolute value, but
smaller than the probability that they are both less than R in absolute
value. That integral evaluates to 1 - exp(-R^2/2). Remember that this
probability applies to both x and y, so we need to take the square root:
P(-R < x < R) > sqrt(1 - exp(-R^2/2))
We require this to hold for N iterations, i.e. with probability
(1 - exp(-R^2/2))^(N/2), which is about 1 - (N/2)*exp(-R^2/2).
We need to equate this to 1 - 10^-40 to get the result you need. R about
sqrt(2*(40*ln 10 + ln(N/2))) will do. In other words, you need to check that
0.5 is more than this number of standard deviations from the mean in order
to get the required probability.
I tried this with the data you suggested. I found the mean was over 36
standard deviations from the critical point 0.5 in the 1,345,000 case (so
probably OK), but only 22 standard deviations in the 1,360,000 case, while
the cutoff as computed above was about 14.52 for each. Of course, these
estimates for the mean and standard deviation *also* need to be tested to
ensure that they are correct with probability 1-10^-40; this was a step I
didn't do, but no doubt its effect is to increase the value of R in the
formula above.
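The cutoff formula above is easy to evaluate directly; a quick check under
the same Gaussian assumption:

```python
import math

def sigma_cutoff(n_iters, p_fail=1e-40):
    """Standard deviations the 0.5 limit must lie from the mean so that
    n_iters iterations all stay below it with probability 1 - p_fail
    (Gaussian assumption, as above)."""
    return math.sqrt(2.0 * (-math.log(p_fail) + math.log(n_iters / 2.0)))

print(sigma_cutoff(1_345_000))  # about 14.5, matching the 14.52 figure
```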
No doubt someone with a little more statistics will be able to explain how
these results are affected by the assumption of a Gaussian distribution; no
doubt 14.52 is the best case. Perhaps a cutoff of 20 standard deviations
would allow some tolerance.
Looks plausible, though.
Chris Nash
Lexington KY
UNITED STATES
=======================================================
Co-discoverer of the 7th and 10th largest known primes.
------------------------------
End of Mersenne Digest V1 #553
******************************