Mersenne: multi-exponent bug in Mlucas 2.6c

EWMAYER Wed, 22 Sep 1999 19:39:19 -0700
Dear Mersenners: I have bad news and good news. First the bad: David Willmore
has reported a bug in Mlucas 2.6c, which may cause some or all exponents beyond
the first in a multi-exponent (range) test to yield incorrect results.

For people doing first LL tests on large exponents, since v2.6 is only one
month old, you're probably still on exponent number 1, and thus should be OK.
The bug is most likely to affect double-checking (which is what David has
been doing). I apologize for any DC time that may have been wasted as a result.

What was happening is this (sorry - this necessarily gets a tad technical):
when the code detects that a new exponent is being done, it deallocates
all the arrays, figures out the new FFT size, reallocates at the new size, 
and re-inits things like sincos data and DWT weights. No problem there.
But...the DWT weights are calculated starting with a parameter, the number
of "bigwords" (residue digits represented with respect to base = 2^ceiling(p/n))
in the Crandall-Fagin mixed-radix representation. For exponent p and runlength
n, the number of bigwords is mod(p,n). This parameter was being properly reset
in the master squaring routine for each new p, but its value was also being
saved in one of the auxiliary routines (for carry propagation) and there it
was only being reset if the FFT length changed, not for a new p at the same n.

To see this in action, create a scratch directory containing a range file
with the following two small p's (both of which use n=512):

9689
11213

These are in fact both Mersenne prime exponents, but when you run v2.6c on
them, only the first is found to be prime. This possibility slipped through
my usual pre-release tests since (a) my self-tests used multiple p's, but
all with different n; (b) the incorrect bigwords parameter leads to an
incorrect carry propagation step, but the resulting residue digits are still
all whole numbers, i.e. you won't see any fatal roundoff errors as a result.
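
As a rough illustration of the failure described above (a sketch only, not
the actual Mlucas source), the following computes the bigword count mod(p,n)
for the two test exponents and mimics a carry routine that keeps a saved copy
of it. The names CarryStep, refresh_on_new_p and bigwords_for are hypothetical
and exist only for this example.

```python
def bigwords(p, n):
    """Number of bigwords for exponent p at runlength n, i.e. mod(p, n)."""
    return p % n


def word_sizes(p, n):
    """Bits per residue digit: bigwords hold ceiling(p/n) bits, the rest floor(p/n)."""
    return -(-p // n), p // n


class CarryStep:
    """Hypothetical stand-in for the saved state in the carry routine."""

    def __init__(self, refresh_on_new_p):
        # False mimics the behaviour described for 2.6c; True mimics a fix.
        self.refresh_on_new_p = refresh_on_new_p
        self.saved_n = None
        self.saved_p = None
        self.saved_bigwords = None

    def bigwords_for(self, p, n):
        # The saved value is always refreshed when the runlength changes.
        # It is refreshed for a new p at the same n only if refresh_on_new_p is set.
        needs_reset = n != self.saved_n or (
            self.refresh_on_new_p and p != self.saved_p
        )
        if needs_reset:
            self.saved_n, self.saved_p = n, p
            self.saved_bigwords = bigwords(p, n)
        return self.saved_bigwords


if __name__ == "__main__":
    n = 512
    for label, step in (("stale copy", CarryStep(False)), ("reset per p", CarryStep(True))):
        for p in (9689, 11213):
            print(label, p, "carry uses", step.bigwords_for(p, n),
                  "correct is", bigwords(p, n))
```

With n=512 the correct counts are 473 for p=9689 and 461 for p=11213, so a
carry step that keeps the first value will process the second exponent with
the wrong digit sizes, while every digit still comes out a whole number.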

Thus, here is how this would affect your testing:

1) Any first exponent out of a range should be fine;
2) Any exponent preceded by one that yields a different runlength is OK;
3) Any exponent preceded by one that uses the same runlength will be bad.
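
The three cases above can be sketched as a small check over a range file's
contents. This is an illustration only: the function name suspect_exponents
is made up here, and the runlength n for each exponent must be supplied by
you (e.g. from your status files), since the rule Mlucas uses to pick n is
not reproduced here.

```python
def suspect_exponents(range_entries):
    """Return the exponents that fall under case 3.

    range_entries is a list of (p, n) pairs in the order they were run,
    where n is the runlength actually used for exponent p.
    """
    suspect = []
    previous_n = None
    for p, n in range_entries:
        # Case 1: the first exponent has no predecessor, so it is fine.
        # Case 2: a predecessor with a different runlength is fine.
        # Case 3: same runlength as the predecessor means a bad result.
        if previous_n is not None and n == previous_n:
            suspect.append(p)
        previous_n = n
    return suspect


if __name__ == "__main__":
    # The two-exponent example from this message, both at n=512.
    print(suspect_exponents([(9689, 512), (11213, 512)]))
```

For the example range this flags only 11213, matching what v2.6c does with it.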

If you're not sure where your current exponents fit into the above, you
can check whether these runs are OK or not as follows:

1) Stop the run in question.
2) Get the beta of Mlucas v2.7: <ftp://209.133.33.168/pub/mayer/README>.
3) Install the proper binary for your platform;
4) Copy the range file into one named worktodo.ini in your LL test directory.
   (you don't need to worry about renaming the savefiles - 2.7 uses
    Prime-95-like savefile names, i.e. your old ones won't get overwritten.)
5) Fire up v2.7. As soon as it has done at least 2000 iterations, compare the
   2000-iteration Res64 (in the pXXXXX.stat file) to the analogous one in
   your old run's "status" file. If they're the same, stop v2.7 and let
   v2.6 finish the exponent (just restart it - no need to play with files).
   Note that if v2.6 was < 20% of the way through the exponent, it may be
   faster to just let 2.7 redo it from scratch - the comparative 2000-
   iteration timings in the files will tell you which will be faster.
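
The resume-or-redo decision in step 5 comes down to simple arithmetic on the
two 2000-iteration timings. Here is a minimal sketch, assuming per-iteration
time is constant over the run and taking the LL test length as p-2 squarings.
The function and parameter names are hypothetical, and the timings are values
you read off your own files.

```python
def ll_iterations(p):
    """A Lucas-Lehmer test of M(p) takes p - 2 squarings."""
    return p - 2


def redo_is_faster(p, iters_done, old_secs_per_2000, new_secs_per_2000):
    """True if redoing from scratch with the new version beats finishing with the old.

    old_secs_per_2000 / new_secs_per_2000 are the 2000-iteration timings
    (in seconds) of the old and new versions on the same machine.
    """
    remaining_old = (ll_iterations(p) - iters_done) * old_secs_per_2000 / 2000
    full_new = ll_iterations(p) * new_secs_per_2000 / 2000
    return full_new < remaining_old


def break_even_fraction(old_secs_per_2000, new_secs_per_2000):
    """Fraction of the run below which redoing with the new version wins."""
    return max(0.0, 1.0 - new_secs_per_2000 / old_secs_per_2000)
```

For instance, if the new timing is 80% of the old one, the break-even point
is 20% of the way through the exponent.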

(5) contains the good news - a beta of v2.7 is available, and it runs
significantly faster than v2.6, especially on the Alpha. See my forthcoming
posting about what's new and improved in v2.7. (Also see the README file.)

David W. also writes:

<< On the [CPU] which was given a mix of numbers  (112K fft and 128K fft size)
something odd is happening.  The Iteration time for the 112K fft was 3:08
(for 2K iterations).  Now, when it switched to the 128K size exponent, the
CPU time for 2K iterations went up to 4:34!!  The run which has always been
doing 128K size exponents has been taking 3:48 very consistently through
both exponents.  

Any idea what would cause that? >>

Possibly some weird cache behavior or interaction with other processes?
On the other hand, if you got 4:34 for the first 2000 squarings, much of
that may be initialization time, and the time should be lower for all
subsequent checkpoints.

On my 400 MHz Alpha 21164, v2.7 gives per-iteration times of .069 and .078
seconds at 112 and 128K, respectively. That translates into 2000-iteration
times of 1:44 and 1:57 (mm:ss) on your 533MHz 21164.
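
For anyone who wants to redo this estimate for another clock speed, the
conversion is sketched below. It assumes the run time scales inversely with
CPU frequency on the same chip family, which is only an approximation; the
function names are just for this example.

```python
def scaled_time(per_iter_sec, iters, measured_mhz, target_mhz):
    """Estimate seconds for iters iterations on a CPU clocked at target_mhz."""
    return per_iter_sec * iters * measured_mhz / target_mhz


def mmss(seconds):
    """Format a duration in seconds as m:ss, rounded to the nearest second."""
    total = round(seconds)
    return f"{total // 60}:{total % 60:02d}"


if __name__ == "__main__":
    # Per-iteration times measured at 400 MHz, scaled to 533 MHz.
    for fft_k, per_iter in ((112, 0.069), (128, 0.078)):
        print(f"{fft_k}K: {mmss(scaled_time(per_iter, 2000, 400, 533))}")
```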

Best regards,
-Ernst
_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers