On 7 May 00, at 17:41, [EMAIL PROTECTED] wrote:
> A large significant (non-error)
> part of the convolution coefficient means that any accumulated rounding
> errors will collect in the least-significant few bits of the floating-point
> mantissa. That's why errors close to 0.5 tend to come in the form of
> (integer)/(small power of 2).
>
> b) Especially for large runlengths (and after the first few hundred iterations
> or so), rounding errors tend to be randomly distributed in an approximately
> Gaussian fashion,
I know perfectly well what you mean, but these two statements tend to
contradict each other. Gaussian distributions are continuous &
smooth; what we have instead is a discrete distribution whose gaps
tend to increase with size.
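To put the "gaps increase with size" point in concrete terms: a double has a
fixed number of mantissa bits, so the spacing between representable values
doubles each time the magnitude crosses a power of two, and the fractional
part of a large convolution output (which is what gets reported as the
roundoff error) can only fall on that coarser grid. A tiny Python
illustration (nothing Prime95-specific; math.ulp needs Python 3.9 or later):

    import math

    # Spacing between adjacent doubles grows with the magnitude of the value:
    for x in (0.5, 1.0, 1024.0, 2.0 ** 40):
        print(x, math.ulp(x))
    # 0.5     -> 2**-53
    # 1.0     -> 2**-52
    # 1024.0  -> 2**-42
    # 2**40   -> 2**-12
    # The larger the convolution output, the coarser the grid on which its
    # fractional part - the reported rounding error - can land.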
If we have a mechanism which tends to "chop down" the result of a
floating-point operation towards zero when the FPU register isn't
accurate enough to contain it all, and we're down to 4 guard bits,
then an actual rounding error of 0.499999 could be represented as
0.4375, i.e. there is almost no safety at all between 0.4375 and 0.5.
If we _ever_ see a value between 0.4375 and 0.5, we must have more
than 4 guard bits, and the logic of your argument is that we are
probably OK if we see 0.4375's, but not 0.46875's.
We should probably find out how much safety we have - i.e. can we
provoke a 0.46875, or even a 0.484375.
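The arithmetic behind those particular values: chopping an error towards zero
with g guard bits leaves a multiple of 2**-g, so the largest reportable value
below 0.5 is 0.4375 (7/16) with 4 guard bits, 0.46875 (15/32) with 5, and
0.484375 (31/64) with 6. A quick sketch (pure illustration of the
quantisation, not how any LL code actually measures the error):

    # Chopping an error towards zero with g guard bits leaves a multiple of 2**-g.
    def chop(err, guard_bits):
        scale = 2 ** guard_bits
        return int(err * scale) / scale   # int() truncates towards zero

    true_err = 0.499999
    for g in (4, 5, 6):
        print(g, chop(true_err, g))
    # 4 -> 0.4375    (7/16)  - everything in [0.4375, 0.5) reports as 0.4375
    # 5 -> 0.46875   (15/32)
    # 6 -> 0.484375  (31/64)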
As for fixing the problem (when we are in a position to make a
rational analysis of what constitutes a problem!), there would seem to
be an automatic recovery method, as follows (a rough sketch in code
appears after the list):
1) Unpack the work vector for the previous iteration & recover the
true residual.
2) Generate a new work vector for the next FFT run length & pack the
residual into it.
3) Run one iteration - i.e. re-run the iteration that caused the
excess roundoff panic.
4) Unpack the work vector & recover the true residual after running
this single iteration.
5) Generate a new work vector for the original FFT run length & pack
the residual into it.
6) Continue as if nothing had happened.
This will keep the speed high except for a few isolated iterations.
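In pseudocode the recovery would look roughly like this. The pack, unpack
and iterate routines are hypothetical names standing in for whatever the
client actually provides; the point is only the shape of the control flow:

    # Sketch of the "run one iteration at a larger FFT run length" recovery.
    # Hypothetical helpers, supplied by the caller:
    #   unpack(work_vector, fft_len)   -> integer residual
    #   pack(residual, fft_len)        -> work_vector
    #   iterate(work_vector, fft_len)  -> new work_vector (one LL iteration)
    def recover_iteration(work_vector, fft_len, larger_fft_len,
                          unpack, pack, iterate):
        residual = unpack(work_vector, fft_len)           # 1) recover the true residual
        big_vector = pack(residual, larger_fft_len)       # 2) repack at the larger run length
        big_vector = iterate(big_vector, larger_fft_len)  # 3) re-run the offending iteration
        residual = unpack(big_vector, larger_fft_len)     # 4) recover the residual again
        return pack(residual, fft_len)                    # 5) repack at the original run length
        # 6) the caller carries on with the normal loop as if nothing had happened

The caller would invoke this only when the roundoff check trips, so the cost
is one slow iteration rather than a whole slow run.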
The argument here is that running the whole test with a larger FFT
run length is computationally expensive; it's very likely to be
unnecessary (unless the FFT run length breakpoint is grossly
incorrect - in which case a run will contain a great many
"recoveries"); finally, even if an excess rounding error does slip
through & cause an incorrect result, double-checking with a different
program will pick it up.
Prime95 is a bit different:
The extra 11 bits of mantissa in the Intel FPU registers (the 80-bit
extended format carries a 64-bit mantissa against the 53 bits of a
double) make the roundoff errors appear much more smoothly distributed,
so, even if we do see the odd 0.45..., we _might_ still be OK.
Prime95, working in its default mode, only checks the convolution error
every 128 iterations, so there's _some_ chance that things do, in
fact, slip through. The safeguard here is that repeating the run with
a different offset (as in a double-check run) will give a different
distribution of roundoff errors, and (if the calculation went wrong
as a result of an excess rounding error) it will not go wrong at the
same place again - so the final residual will be different.
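For what it's worth, the check-every-128-iterations behaviour could be
pictured like this; a sketch only, with a hypothetical lucas_step routine and
an illustrative error limit (0.4 here) rather than Prime95's exact figure:

    CHECK_INTERVAL = 128   # Prime95's default checking interval, as described above
    ERROR_LIMIT = 0.4      # illustrative threshold, not Prime95's exact figure

    # lucas_step is a hypothetical callable:
    #   lucas_step(work_vector, check_error) -> (work_vector, maxerr)
    # where maxerr (the largest |x - round(x)| in the normalisation pass) is
    # only computed when check_error is True, since measuring it costs time.
    def run(work_vector, iterations, lucas_step):
        for i in range(1, iterations + 1):
            check = (i % CHECK_INTERVAL == 0)
            work_vector, maxerr = lucas_step(work_vector, check)
            if check and maxerr > ERROR_LIMIT:
                raise RuntimeError("excess roundoff %.5f at iteration %d" % (maxerr, i))
        return work_vector
    # An excess error in one of the 127 unchecked iterations can slip through;
    # the double-check run, starting from a different offset, sees a different
    # pattern of rounding errors, so the same wrong answer will not recur.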
Regards
Brian Beesley