I agree with everything that Lucas said.
For the project to benefit, the participants would have to AGREE
to restructure the work. Savings would result from stopping the
diverging work earlier. (And if "passing around" intermediate
files was accepted, by starting any triple-check work later.)
In essence, the idea is to take that 1_year_to_complete LL test of
an exponent and BREAK IT DOWN into (n) discrete serial work units.
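
To make the breakdown concrete, here is a minimal sketch (mine, not
anything prime95 does today) of an LL test cut into (n) serial work
units, each reporting the low 64 bits of its "final" value. The function
names and the even split of the p-2 iterations are assumptions made for
illustration only.

```python
def unit_boundaries(p, n_units):
    """Split the p-2 LL iterations into n_units consecutive ranges."""
    total = p - 2
    return [total * k // n_units for k in range(n_units + 1)]


def ll_work_unit(p, s, start, stop):
    """Run LL iterations start..stop-1 on state s for exponent p."""
    m = (1 << p) - 1  # the Mersenne number 2^p - 1
    for _ in range(start, stop):
        s = (s * s - 2) % m
    return s


def ll_test_in_units(p, n_units):
    """Run the whole test unit by unit.

    Returns (is_prime, residues), where residues holds the low 64 bits
    of the state at the end of each work unit.
    """
    bounds = unit_boundaries(p, n_units)
    s = 4  # standard LL starting value
    residues = []
    for start, stop in zip(bounds, bounds[1:]):
        s = ll_work_unit(p, s, start, stop)
        residues.append(s & 0xFFFFFFFFFFFFFFFF)
    return s == 0, residues
```

The state returned by ll_work_unit is exactly what the "intermediate
file" would have to carry from one work unit to the next. Two people who
run the same unit from the same starting state should report the same
64-bit residue for it.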
mikus
Proposed procedural steps:
(b) When 'Tester 1' requests work, assign ONLY work unit (1) to him.
However, the project RESERVES exponent (EEEE) to 'Tester 1' -
only *he* will (in the future) be assigned initial testing of
further work units within exponent (EEEE).
(c) Eventually, Tester 1 finishes that work unit, and begins to
work on some other work unit on some other exponent. HOWEVER,
'Tester 1' saves the intermediate file (of how far he got with
exponent EEEE).
Somewhere along the line, exponent (EEEE) work unit (1) becomes
available to be assigned to a double-checker.
(d) Now, when 'Checker 2' requests work, assign exponent (EEEE) work
unit (1) to him for double-checking.
Since this is work unit (1), there is as yet no intermediate
file for exponent (EEEE). If, however, this was a subsequent
work unit, and Checker 2 was *not* the person who performed
the preceding work unit's test/check, then Checker 2 might
OBTAIN the intermediate file from whoever did that preceding
work unit.
(e) When Checker 2 finishes work unit (1) of exponent (EEEE), his
(work unit "final") residue is compared to the residue reported
for work unit (1) by Tester 1. HOWEVER, 'Checker 2' saves the
intermediate file (of how far he got with exponent EEEE).
(f) If the residues match, exponent (EEEE) work unit (2) becomes
available to be assigned to 'Tester 1'.
Now, 'Tester 1' and 'Checker X' repeat steps (b) - (e), until
the residues of the FINAL work unit on exponent (EEEE) agree.
----
If the residues of the most recently tested AND checked work
unit of exponent (EEEE) do not match, then that work unit
becomes available to be assigned to a triple-checker.
(g) Now, when 'Checker 3' requests work, assign the diverging work
unit within exponent (EEEE) to him for re-checking.
Checker 3 might OBTAIN from Tester 1 the intermediate file for
exponent (EEEE) as of two work units back. Checker 3 will
perform the first of those two work units (i.e., the one
previous to the divergence) to verify that his (work unit
"final") residue matches the (known good) one from Tester 1.
(h) Checker 3 will then go on to triple-check the diverging work
unit. When Checker 3 finishes that work unit of exponent
(EEEE), his (work unit "final") residue is compared to the
residue reported by Tester 1.
(i) If the residues match, the next work unit of exponent (EEEE)
becomes available to be assigned to 'Tester 1', and testing
and checking continue the same as with a successful step (f).
----
If Checker 3's (work unit "final") residue does not match
Tester 1's, but does match Checker 2's, then exponent (EEEE)
is un-reserved from 'Tester 1', and is instead RESERVED to
'Checker 2'. Testing and checking of the work units of (EEEE)
continue with step (i), where 'Checker 2' replaces 'Tester 1'.
----
(k) If Checker 3's (work unit "final") residue on the diverging
work unit of exponent (EEEE) matches neither Tester 1's nor
Checker 2's, maybe the best thing to do is to start all over
with a new set of participants.
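
The residue comparisons in the steps above can be sketched as one small
decision function. This is only an illustration of the proposed rules;
the names (Outcome, resolve_work_unit) are made up here and are not part
of prime95 or Primenet.

```python
from enum import Enum


class Outcome(Enum):
    ADVANCE = "advance"            # next work unit goes to the reserved owner
    TRIPLE_CHECK = "triple_check"  # tester and checker disagree, need a third run
    REASSIGN = "reassign"          # reservation moves from Tester 1 to Checker 2
    RESTART = "restart"            # no two residues agree, start over


def resolve_work_unit(tester, checker, triple=None):
    """Decide what happens to a work unit from its reported residues.

    tester, checker: residues from Tester 1 and Checker 2.
    triple: residue from Checker 3, or None if no triple-check was run.
    """
    if tester == checker:
        return Outcome.ADVANCE       # step (f)
    if triple is None:
        return Outcome.TRIPLE_CHECK  # step (g)
    if triple == tester:
        return Outcome.ADVANCE       # step (i)
    if triple == checker:
        return Outcome.REASSIGN      # Checker 2 replaces Tester 1
    return Outcome.RESTART           # step (k)
```

For example, resolve_work_unit(0x1A, 0x2B) asks for a triple-check, and
resolve_work_unit(0x1A, 0x2B, 0x2B) moves the reservation to Checker 2.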
Comment: If the breaking down of a long LL computation into discrete
units of work is adopted, thought might be given to having
prime95 generate a "universal interchange" intermediate
file at the end of a unit of work - in other words, special
output that would be acceptable as input across a number of
operating systems and/or hardware chip types.
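
As a rough idea of what such an interchange file could look like, here
is a sketch that stores the exponent, the iteration count and the full
LL value in fixed big-endian byte order with a CRC-32 at the end. The
layout, the "LLCK" tag and the function names are purely hypothetical;
this is not the format prime95 actually writes.

```python
import struct
import zlib

MAGIC = b"LLCK"                  # hypothetical file tag
HEADER = struct.Struct(">4sII")  # tag, exponent, iteration (big-endian)


def dump_checkpoint(exponent, iteration, value):
    """Serialize an LL state into platform-independent bytes."""
    body = value.to_bytes((exponent + 7) // 8, "big")
    payload = HEADER.pack(MAGIC, exponent, iteration) + body
    return payload + struct.pack(">I", zlib.crc32(payload))


def load_checkpoint(data):
    """Parse bytes written by dump_checkpoint, verifying the checksum."""
    payload = data[:-4]
    (crc,) = struct.unpack(">I", data[-4:])
    if zlib.crc32(payload) != crc:
        raise ValueError("checksum mismatch")
    magic, exponent, iteration = HEADER.unpack(payload[:HEADER.size])
    if magic != MAGIC:
        raise ValueError("not a checkpoint file")
    value = int.from_bytes(payload[HEADER.size:], "big")
    return exponent, iteration, value
```

Because every field has an explicit byte order and size, a file written
on one chip type should read back identically on another. The price is
that the program would have to convert its internal (floating-point)
representation to a plain integer at the end of each work unit.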
Note: For the sake of "fairness", becoming the 'reserved owner'
on a 10-unit exponent might require 10 'credits' earned
through doing checking.
In article <[EMAIL PROTECTED]>,
Lucas Wiman <[EMAIL PROTECTED]> wrote:
> > This idea is rather obvious, and no, I don't remember seeing it either.
>
> This had been discussed earlier. Brian and I talked about it for a little
> while, he came up with the original idea.
>
> > I think the idea has definite merit. If an error does occur, it's equally
> > likely to happen at any step along the way, statistically. Errors are every
> > bit as likely to happen on the very first iteration as they are during the
> > 50% mark, or the 32.6% mark, or on the very last iteration.
>
> True, but if the system is malfunctioning then the errors should start
> early.
>
> > Especially as the exponents get larger and larger, I see a *definite*
> > possibility to reduce double check times by having first time LL tests
> > report residues at certain "percentages" along the way.
>
> Yeah. The error rate should be proportional to the runtime which increases
> with the square of the exponent (ouch!).
>
> > Just for example, every 10% along the way, it'll send its current residue
> > to the Primenet server.
>
> I'm guessing that you mean a certain amount of the residue. Sending in
> 10 2meg files for *each* exponent in the 20,000,000 range would get very
> unwieldy, and inconvenient for people and primenet.
>
> Of course, this would only help if we were running more than one test for the
> same exponent at the same time (otherwise, this would just be a pointless
> way to do a triple check). They would either have to be coordinated
> (running at the same time, logistical nightmare), or (as Brian suggested)
> have a "pool" of exponents running on one computer. That is to say when
> one computer finishes to X%, it reports its 64-bit residue to primenet, and
> waits for the second computer working on the same LL test to do the same.
> Until the other (slower) computer reports in, the (faster) computer works on
> another exponent.
>
> This would speed up the entire project, but it would slow down the individual
> exponent, which would make people mad :(.
>
> > I forget the numbers being tossed around,
> > but you'd only save 50% of (the error rate) of the
> > checking time.
>
> As I pointed out above, the error rate should increase with the square of the
> exponent (plus change). This means that if 1% have errors at 7mil, 22% will
> have errors at 30mil.
>
> -Lucas Wiman
>
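
The scaling estimate quoted above can be checked with a few lines. This
is a sketch under stated assumptions: the error rate is taken as
directly proportional to runtime, runtime as the square of the exponent,
and the optional logarithmic factor is my guess at what the "plus
change" stands for. The function name is made up for illustration.

```python
import math


def scaled_error_rate(base_rate, base_exp, new_exp, log_factor=False):
    """Scale an observed error rate from one exponent size to another.

    Assumes error rate is proportional to runtime and runtime grows as
    exponent squared, optionally times log(exponent).
    """
    ratio = (new_exp / base_exp) ** 2
    if log_factor:
        ratio *= math.log(new_exp) / math.log(base_exp)
    return base_rate * ratio


print(scaled_error_rate(0.01, 7e6, 30e6))
print(scaled_error_rate(0.01, 7e6, 30e6, log_factor=True))
```

With 1% at 7 million this gives about 18% at 30 million from the square
alone, and about 20% with the logarithmic factor, which is in the same
neighborhood as the 22% figure quoted.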
_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ -- http://www.tasam.com/~lrwiman/FAQ-mers