At 02:57 PM 2/3/01 -0600, you wrote:

> >With increasing exponent size (and therefore run time), I'd like to
> >see PrimeNet evolve to track intermediate residues & also to be able
> >to coordinate parallel LL testing & double-checking, so that runs
> >which are going wrong can be stopped for investigation without having
> >to be run through to the end.
>
>In the QA effort, we've seen a few instances already of errors caught
>midway by doing a manual/email version of this.  Brian Beesley had an error
>detected this way in his run of a double-check of a 10-megadigit exponent.
>This exponent takes a PII-400 428 days (yes 14 months) to complete,
>so detecting the one error and restarting early saves about 10.5 PII-400
>months.

I think this is an EXCELLENT idea, but remember that the "s" values (i.e. 
the intermediate residues) for such numbers are quite simply enormous.  
One couldn't (and shouldn't) check the entire intermediate value, but 
merely the last "x" bits, where "x" is enough to be reasonably certain 
that a match isn't random chance -- say, the final 1024 bits.
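
To make the "last x bits" idea concrete, here is a minimal sketch of a
Lucas-Lehmer loop that records a truncated residue every so often. The
function name, the "every" interval and the "bits" parameter are my own
assumptions for illustration; this is not how Prime95 or PrimeNet actually
does it, and it uses plain Python integers rather than FFT multiplication.

```python
def ll_interim_residues(p, every, bits=64):
    """Lucas-Lehmer test of 2**p - 1 for an odd prime p.

    Returns (is_prime, checkpoints), where checkpoints is a list of
    (iteration, low_bits_of_s) pairs taken every `every` iterations.
    `every` and `bits` are illustrative parameters, not PrimeNet settings.
    """
    m = (1 << p) - 1          # the Mersenne number being tested
    mask = (1 << bits) - 1    # keeps only the lowest `bits` bits of s
    s = 4
    checkpoints = []
    for i in range(1, p - 1):           # p - 2 squarings in total
        s = (s * s - 2) % m
        if i % every == 0:
            checkpoints.append((i, s & mask))
    return s == 0, checkpoints


if __name__ == "__main__":
    is_prime, cps = ll_interim_residues(127, 25, bits=64)
    print(is_prime, [i for i, _ in cps])
```

Two healthy runs of the same exponent should report identical pairs at every
checkpoint. If one run has gone wrong, the chance that its truncated residue
still matches by accident is roughly 1 in 2**bits.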

PrimeNet would thus also have to assign each exponent to machines with 
similar runtimes and performance.  It would do little good to assign the 
primary test to an Athlon-800 and the "real-time" double-check to a much 
slower machine, as the Athlon would quickly outpace the second check.
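
A rough sketch of how such matching might work: rank machines by their
estimated time to finish the exponent and pair neighbours whose estimates
are close. The function, the "max_ratio" tolerance and the machine names are
all hypothetical, and only the 428-day PII-400 figure comes from this
thread; the other runtimes are made up for the example.

```python
def pair_by_speed(machines, max_ratio=1.25):
    """Pair machines with similar estimated runtimes.

    machines:  dict of machine name -> estimated days for the exponent.
    max_ratio: hypothetical tolerance; the slower partner may take at
               most this many times as long as the faster one.
    Returns (pairs, unpaired).
    """
    ranked = sorted(machines.items(), key=lambda kv: kv[1])  # fastest first
    pairs, unpaired = [], []
    i = 0
    while i < len(ranked):
        name, days = ranked[i]
        if i + 1 < len(ranked) and ranked[i + 1][1] <= max_ratio * days:
            pairs.append((name, ranked[i + 1][0]))
            i += 2          # both machines are now taken
        else:
            unpaired.append(name)
            i += 1
    return pairs, unpaired


if __name__ == "__main__":
    # Illustrative runtimes in days; only the PII-400 figure is from the thread.
    est = {"athlon800": 200, "athlon750": 215, "pii400": 428, "p166": 1100}
    print(pair_by_speed(est))
```

Machines left unpaired would simply take ordinary, non-real-time work.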

If a discrepancy were found in a real-time double-check, a third run on a 
different machine could determine which (if either) of the two intermediate 
residues was correct, and the tests could proceed from there, with both 
original machines resuming from the same correct residue.
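
The decision itself is tiny once the third residue is in. A minimal sketch,
assuming all three residues were taken at the same iteration and truncated
to the same number of bits (the function name is mine):

```python
def resolve(first, second, tiebreak):
    """Pick the residue that two of the three runs agree on.

    Returns the agreed residue, or None when the tiebreaker matches
    neither original run (the "if either" case), which would call for
    yet another run.
    """
    if tiebreak == first or tiebreak == second:
        return tiebreak
    if first == second:
        return first   # the originals agree; the tiebreaker itself is suspect
    return None


if __name__ == "__main__":
    print(hex(resolve(0xDEADBEEF, 0xDEADBEEE, 0xDEADBEEF)))
```

Whichever original machine disagrees with the returned value is the one that
has to discard its work back to the last checkpoint known to be good.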

Also, if this did evolve, I'd suggest that the "double-checker" be given 
equal credit with the primary machine, for purposes of credit in history 
books as discoverers, and/or EFF monies.

>I assume that Brian means sending intermediate 64-bit residues to Primenet
>for comparison.  (The intermediate save files are too big to send with any
>frequency and would require a lot of storage.)
>
>To automate checking via interim residues would require significant longterm
>storage at primenet, of quadruples containing exponent, iteration, 64-bit
>residue, and the source of the information (person or machine ID).  When 
>two with matching exponent and iteration but different source were available a
>comparison would be made; if a discrepancy was found, both runs should be 
>halted while a tiebreaker run was made via a different source, to avoid 
>wasting cpu time of one or both original sources.  Since the most likely 
>cause of a discrepancy is error in one run not both, a resume capability 
>as well as a discard capability would be needed.  I feel exponents halted 
>for a tiebreaker run should not be expired.

I'd agree.  Machines awaiting a tie-breaker could move on to factoring, or 
to another, smaller double-check.  I would not want to see such machines 
begin another 14-month effort, since once the tiebreaker concluded, that new 
work would have to be suspended while the original test was finished.
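
For what it's worth, the server-side bookkeeping for the quadruples quoted
above (exponent, iteration, residue, source) could be quite small. This is
only a sketch of the comparison logic; the class name, method name and status
strings are assumptions of mine, not anything PrimeNet implements.

```python
from collections import defaultdict


class ResidueLog:
    """In-memory store of (exponent, iteration, residue, source) reports."""

    def __init__(self):
        # (exponent, iteration) -> {source: residue}
        self._seen = defaultdict(dict)

    def report(self, exponent, iteration, residue, source):
        """Record one report and compare it with other sources.

        Returns "pending" if no other source has reported this
        exponent/iteration yet, "match" if every other source agrees,
        and "mismatch" otherwise (the cue to halt both runs and
        schedule a tiebreaker).
        """
        entries = self._seen[(exponent, iteration)]
        others = [r for src, r in entries.items() if src != source]
        entries[source] = residue
        if not others:
            return "pending"
        return "match" if all(r == residue for r in others) else "mismatch"


if __name__ == "__main__":
    log = ResidueLog()
    print(log.report(33219281, 1000000, 0x1234ABCD, "machine-A"))
    print(log.report(33219281, 1000000, 0x1234ABCD, "machine-B"))
```

The exponent and residues in the usage lines are placeholders. A real server
would also need the resume and discard handling described in the quoted
text, which this sketch leaves out.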

Note that there's a point of futility, at which a "tie-breaker" ought to 
merely be a triple-check, run to conclusion.  Let's say on a 14-month co-op 
effort, 13.6 months into it a discrepancy was found.   Both machines ought 
to finish, and just have it triple-checked, rather than suspending both, 
awaiting a tiebreaker.   While I'm sure someone could solve for the optimum 
cutoff point where tiebreakers are not useful, my guess would be that it is 
around 85% of the way to completion.
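
As a back-of-the-envelope model only: assume the tiebreaker starts from
iteration 0 on a machine of the same speed, that both original machines can
resume from the verified residue afterwards, and that suspended machines do
other useful work in the meantime. None of those assumptions is guaranteed,
and the function name is mine.

```python
def cpu_saved_by_tiebreak(total_months, fraction_done):
    """CPU-months saved by suspending for a tiebreaker instead of
    letting both runs finish and then triple-checking.

    total_months:  length of one full test on these machines.
    fraction_done: how far along (0..1) the discrepancy was found.
    """
    remaining = (1 - fraction_done) * total_months
    # Both finish, then a full triple-check is run from scratch.
    run_through = 2 * remaining + total_months
    # Tiebreaker recomputes up to the discrepancy, then both finish.
    tiebreak = fraction_done * total_months + 2 * remaining
    return run_through - tiebreak


if __name__ == "__main__":
    for f in (0.25, 0.50, 0.85, 13.6 / 14):
        print(f"{f:.2f}  {cpu_saved_by_tiebreak(14, f):.1f} months saved")
```

Under these assumptions the saving is just the unfinished part of the run,
so it shrinks steadily: about 2.1 months at 85% of a 14-month test, and about
0.4 months at 13.6 months in.  Where the cutoff belongs then depends on how
much the suspend/resume coordination is judged to cost, which this model
doesn't try to capture.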
_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
