Paul Leyland (whom I'll meet for the second time later this month) writes:
> > From: Brian J. Beesley [mailto:[EMAIL PROTECTED]]
>
> > A couple of points here:
> > (1) Can anyone honestly commit to a project for a whole decade?
> Yes, they can and do. In the field of computational number theory, several
> participants in the Cunningham project have been active much longer than my
> decade. Sam Wagstaff has been co-ordinating it longer than that. To be
> honest, I don't know when he started. Perhaps Peter Montgomery (another
> long-time participant) can inform us, as I know he's also on the Mersenne
> list.
> In other fields, it's not unusual to commit to a decade or more. Consider
> the deep space missions such as Voyager, or the astronomers who monitor a
> particular class of objects for most of their working lives. I know some
> amateurs who have been observing particular variable stars since the 1960's;
> George Alcock has been searching for comets and novae since the late
> fifties.
Ten years for one computation seems a long time. Yes, I have been
contributing to the Cunningham project for a long time, using machines
at multiple institutions, but we can hardly commit for ten years.
Jobs change today, at least in the USA.
A home machine may be stolen or damaged by fire or floods.
My family wouldn't know what to do if I died.
For the very long LL computations, it is reasonable for two sites
A and B to start the computation, and do a checkpoint perhaps
every 1000 iterations. When A reaches its 7000-th iteration
(or other multiple of 1000), it sends the checkpoint file to a central site,
and can continue for another 1000 iterations once B has submitted a
consistent checkpoint for the 6000-th iteration, A and B having used
different programs for iterations 5001-6000.
Otherwise A is blocked (i.e., told to do something else until B catches up).
If A and B differ at the 6000-th iteration checkpoint,
or if one times out (say six months with no activity),
a third site can restart from the (agreed) 5000-th iteration checkpoint.
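The gating rule above can be sketched as follows. This is only an illustration of the protocol described in the text; the names (`Checkpoint`, `may_continue`) and the dictionary representation of each site's submissions are mine, and the 1000-iteration interval follows the example.

```python
from dataclasses import dataclass

CHECKPOINT_INTERVAL = 1000  # iterations between checkpoints, per the example

@dataclass
class Checkpoint:
    iteration: int   # e.g. 5000, 6000, 7000
    residue: bytes   # full LL residue at that iteration

def may_continue(a: dict, b: dict, iteration: int) -> bool:
    """Decide whether site A may advance past `iteration`.

    A may continue past its N-th checkpoint only if partner B has
    already submitted a matching checkpoint for iteration N - 1000.
    `a` and `b` map iteration numbers to Checkpoint objects.
    """
    prev = iteration - CHECKPOINT_INTERVAL
    if prev not in b:
        return False            # B has not caught up: A is blocked
    if prev not in a:
        return False            # A never submitted its own earlier checkpoint
    # A and B ran different programs over the same iterations, so equal
    # residues give strong evidence that both computations are correct.
    return a[prev].residue == b[prev].residue
```

If the residues at `prev` disagree, `may_continue` returns False and, as in the text, a third site would restart from the last agreed checkpoint.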
It is important that the different programs (e.g., Prime95,
MacLucas, LucasMayer) agree on a common checkpoint format. Not only must the
central server be able to compare them, but a contributor might migrate from
a Pentium in 1996 to an Alpha today to a Merced in 2002.
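To make the portability point concrete, here is one hypothetical fixed layout (not a format that Prime95, MacLucas, or LucasMayer actually uses): a little-endian header carrying the exponent and iteration count, followed by the raw residue bytes. Pinning the byte order down is what lets a file written on one architecture be read unchanged on another.

```python
import struct

# Hypothetical portable layout: magic, version, exponent p, iteration,
# residue length, then the residue bytes.  Little-endian throughout, so
# machines of either endianness read the same file identically.
HEADER = struct.Struct("<4sHQQI")   # magic, version, p, iteration, length
MAGIC = b"LLCK"

def write_checkpoint(p: int, iteration: int, residue: bytes) -> bytes:
    return HEADER.pack(MAGIC, 1, p, iteration, len(residue)) + residue

def read_checkpoint(blob: bytes):
    magic, version, p, iteration, length = HEADER.unpack_from(blob)
    assert magic == MAGIC and version == 1
    residue = blob[HEADER.size:HEADER.size + length]
    assert len(residue) == length
    return p, iteration, residue
```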
Of course this data at the central site will be voluminous,
if each exponent around 100 M averages 2.5 checkpoint files of 13 Mb each
(for iterations 5000 and 6000 and possibly 7000 in the example).
If there are 100000 LL tests in progress, then
the central site will need 3250 Gb. A terabyte is huge today but
may be common in 15 years. In the 1970's much data was on
seven-track tapes. I recall each held 1.9 M 60-bit words (14 Mb).
Today the NFS data for M619 is being processed on a filesystem with 16 Gb,
a 1100-fold increase over 25 years.
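The storage figure follows directly from the example's numbers (Mb meaning megabytes, as in the text):

```python
# A 100 M exponent gives roughly 100 M bits of residue per checkpoint,
# i.e. about 13 Mb (megabytes) per file.
checkpoint_mb = 13
files_per_exponent = 2.5        # iterations 5000, 6000, sometimes 7000
tests_in_progress = 100_000

total_gb = tests_in_progress * files_per_exponent * checkpoint_mb / 1000
print(total_gb)   # 3250.0 Gb, i.e. about 3.25 terabytes
```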
Optimizations can reduce the size stored. For example, store the full
5000-th iteration checkpoint, but only hash(6000-th iteration) and
hash(7000-th iteration) where the hash function might return only 1024
bits. When the second site sends a 6000-th iteration checkpoint,
verify that it gives the same hash value. If yes, then we assume
B's full file agrees with A's, and store the 6000-th iteration checkpoint
while deleting the 5000-th. If no, the full 5000-th remains available
for a third site to use.
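A sketch of this hashing optimization follows. The text only says the hash might return 1024 bits; the choice of SHAKE-256 (an extendable-output function that can emit exactly 1024 bits) is mine, as are the class and method names.

```python
import hashlib

HASH_BYTES = 128   # 1024 bits, as in the example

def checkpoint_hash(residue: bytes) -> bytes:
    # SHAKE-256 lets us request a digest of exactly 1024 bits.
    return hashlib.shake_256(residue).digest(HASH_BYTES)

class CentralStore:
    """Keeps one full checkpoint plus hashes of later ones."""

    def __init__(self, full_iter: int, full_residue: bytes):
        self.full_iter = full_iter          # e.g. 5000
        self.full_residue = full_residue    # full checkpoint kept on disk
        self.hashes = {}                    # iteration -> 1024-bit hash

    def record_hash(self, iteration: int, residue: bytes):
        # For A's later checkpoints (6000, 7000, ...) store only the hash.
        self.hashes[iteration] = checkpoint_hash(residue)

    def submit(self, iteration: int, residue: bytes) -> bool:
        """B submits a full checkpoint; advance only if it matches A's hash."""
        if checkpoint_hash(residue) != self.hashes.get(iteration):
            return False    # mismatch: keep the old full checkpoint for a
                            # third site to restart from
        # Agreement: the verified file becomes the new full checkpoint,
        # and the older full checkpoint can be deleted.
        self.full_iter, self.full_residue = iteration, residue
        del self.hashes[iteration]
        return True
```

On agreement the server stores one full file plus short hashes instead of 2.5 full files, cutting the per-exponent storage to roughly 13 Mb.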
Peter
_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ -- http://www.tasam.com/~lrwiman/FAQ-mers