>> There is good reason to believe that George's implementation of
>> double-checking on Intel will not be as fast as the "primary"
>> checking code.
>
> It will be just as fast (well... a few microseconds slower)
>
>> Last I heard, he was planning to make the iteration logic
>> R <- 2*R ; R <- R^2 ; R <- R/4 ; R <- R-2
>
> That is the concept. The implementation is this:
> Shift the first value in the Lucas sequence by a random amount.
> This shift amount represents the position of the "units" bit.
> Square the value. The units bit has now moved - to twice its former
> position. Subtract two at the new position of the units bit.
>
> Sorry if the above isn't real clear. The effect is that the
> program need only do a little bit of extra bookkeeping.
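For concreteness, here's my mental picture of the scheme as a toy
sketch (Python, schoolbook arithmetic rather than FFTs, so only
sensible for tiny exponents; the names and the final rotate-back step
are my own guesses, not anything from your code - p is the exponent I
call N below). The stored value is the true Lucas-Lehmer residue
rotated by offset bits mod 2^p - 1; squaring doubles the offset, and
the 2 is subtracted at the shifted units position:

def ll_test_shifted(p, shift):
    # Lucas-Lehmer test with a shifted residue. All values live mod
    # M = 2^p - 1, where multiplying by 2^k is a k-bit cyclic rotation.
    M = (1 << p) - 1
    offset = shift % p
    r = (4 << offset) % M            # start from 4, rotated by a random amount
    for _ in range(p - 2):
        r = (r * r) % M              # squaring doubles the rotation of the units bit
        offset = (2 * offset) % p    # the extra bookkeeping: track the new offset
        r = (r - (2 << offset)) % M  # subtract 2 at the new units position
    # Rotate back once at the very end (multiplying by 2^(p-offset) mod M
    # undoes the rotation by 2^offset).
    return (r << ((p - offset) % p)) % M

# Sanity check: 2^13 - 1 is prime, so the final residue is 0 whatever the shift.
assert ll_test_shifted(13, 0) == ll_test_shifted(13, 5) == 0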
George, I think the critical point which I don't quite understand is
this: do you shift in k bits & shift out 2k bits for each iteration
(in which case I can't see how the extra CPU cycles are +/- "free"),
or do you just shift in k bits before starting the first iteration &
shift out k*2^(N-2) bits after finishing the last ((N-2)th)? In the
latter case, I understand that the overhead is minimal, but is there
really going to be anything left in the result field after you've
thrown away all those bits ... there's only N bits in the
whole residue!
So far as subtracting the 2 is concerned, this isn't a problem; I did
say in my message that collapsing this step into the shift-out was
possible.
> The benefit is that the FFT is dealing with very different data.
> I ran the idea past Richard Crandall and he saw no problems
> (not that he did a detailed analysis).
George, you are a *real* expert in this area, and so is Richard
Crandall; if you're both happy with the concept, then I'm definitely
satisfied!
> As a special bonus, when two users accidentally test the same
> exponent, we will have a useful double-check instead of a
> wasted computation.
I'd tend to agree, except for the fact that *some* pure repeats are
surely "useful", in the sense that we can use the results to give an
estimate of the reliability of any given test result. Random errors
can creep in from lots of places. Most PCs have only non-parity
memory, & random one-bit, one-cycle memory failures are known, even
if rare.
On an operational matter, IMHO it might be a good policy to have
"classic" Pentiums prefer double-checks whilst PIIs prefer primary
tests - or to base the break-point on CPU speed (150 MHz?) instead of
processor type - the point being that this would help stop all the
quick tests being finished off by the fast machines, leaving us with
the same problem of a shortage of work that slower systems can
complete in a sensible time.
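In other words, something as crude as the following rule would do
(pure illustration on my part - the 150 MHz figure and the function
name are invented, not a reference to anything in the current client):

def preferred_assignment(cpu_mhz, breakpoint_mhz=150):
    # Illustrative policy only: steer slower machines towards double-checks
    # & faster ones towards primary tests. The break-point could equally
    # well be processor type ("classic" Pentium vs PII) rather than speed.
    return "double-check" if cpu_mhz < breakpoint_mhz else "primary LL test"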
Regards
Brian Beesley