On 7 May 00, at 17:41, [EMAIL PROTECTED] wrote:

> A large significant (non-error)
> part of the convolution coefficient means that any accumulated rounding
> errors will collect in the least-significant few bits of the floating-point
> mantissa. That's why errors close to 0.5 tend to come in the form of
> (integer)/(small power of 2).
> 
> b) Especially for large runlengths (and after the first few hundred iterations
> or so), rounding errors tend to be randomly distributed in an approximately
> Gaussian fashion,

I know perfectly well what you mean, but these two statements tend to 
contradict each other. Gaussian distributions are continuous & 
smooth; what we have instead is a discrete distribution whose gaps 
tend to increase with size.

If we have a mechanism which tends to "chop down" the result of a 
floating-point operation towards zero when the FPU register isn't 
accurate enough to contain it all, and we're down to 4 guard bits, 
then an actual rounding error of 0.499999 could be reported as 
0.4375 (= 7/16), i.e. there is almost no safety margin at all between 
0.4375 and 0.5.
If we _ever_ see a value between 0.4375 and 0.5, we must have more 
than 4 guard bits, and the logic of your argument is that we are 
probably OK if we see 0.4375's, but not 0.46875's.

We should probably find out how much safety we have - i.e. can we 
provoke a 0.46875 (= 15/32), or even a 0.484375 (= 31/64)?
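To make the chopping behaviour concrete, a few lines of Python (purely illustrative - this models the chop-towards-zero mechanism described above, not any real FPU's behaviour):

```python
GUARD_BITS = 4  # the figure assumed in the argument above

def chop(err, bits=GUARD_BITS):
    # Chop a "true" rounding error down towards zero onto the grid
    # of values representable with the given number of guard bits.
    step = 2.0 ** -bits            # grid spacing: 1/16 for 4 guard bits
    return int(err / step) * step

# With 4 guard bits, a true error of 0.499999 is reported as 7/16:
print(chop(0.499999))              # 0.4375
# Seeing 0.46875 (= 15/32) implies at least 5 guard bits:
print(chop(0.46875, bits=5))       # 0.46875
# ...and 0.484375 (= 31/64) needs a 6th bit, which 5 bits chop off:
print(chop(0.484375, bits=5))      # 0.46875
```

So if reported errors of 0.46875 or 0.484375 ever turn up, we can conclude the FPU is carrying at least 5 or 6 guard bits respectively.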

As for fixing the problem (when we are in a position to make a 
rational analysis of what constitutes a problem!) there would seem to 
be an automatic recovery method, as follows:

1) Unpack the work vector for the previous iteration & recover the 
true residual.
2) Generate a new work vector for the next FFT run length & pack the 
residual into it.
3) Run one iteration - i.e. re-run the iteration that caused the 
excess roundoff panic.
4) Unpack the work vector & recover the true residual after running 
this single iteration.
5) Generate a new work vector for the original FFT run length & pack 
the residual into it.
6) Continue as if nothing had happened.
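The six steps above can be sketched in a few lines. This is a toy model, with all names invented for illustration: the residual is a plain integer mod 2^p - 1, the "work vector" is just its digits in some small base (standing in for the FFT work vector), and one iteration is the Lucas-Lehmer step x^2 - 2.

```python
P = 13                     # toy exponent; residuals live mod 2^13 - 1
M = (1 << P) - 1

def pack(residue, nwords, bits):
    # Split the integer residual into nwords digits of `bits` bits each
    # (standing in for packing a residual into an FFT work vector).
    return [(residue >> (i * bits)) & ((1 << bits) - 1) for i in range(nwords)]

def unpack(work, bits):
    # Recover the true integer residual from the digit vector.
    return sum(d << (i * bits) for i, d in enumerate(work))

def iteration(work, bits):
    # One Lucas-Lehmer step, x -> x^2 - 2 mod 2^P - 1, on the packed form.
    x = unpack(work, bits)
    return pack((x * x - 2) % M, len(work), bits)

def recover_and_continue(work, bits, nwords, big_bits, big_nwords):
    residue = unpack(work, bits)                 # 1) true residual
    big = pack(residue, big_nwords, big_bits)    # 2) repack, larger run length
    big = iteration(big, big_bits)               # 3) re-run the suspect iteration
    residue = unpack(big, big_bits)              # 4) true residual afterwards
    return pack(residue, nwords, bits)           # 5) back to the original length
                                                 # 6) caller continues as normal
```

A larger FFT run length corresponds to more words carrying fewer bits each (here, e.g., 7 words of 2 bits instead of 4 words of 4 bits); in the toy model the round trip through the larger representation gives exactly the same residual as running the iteration at the original length.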

This will keep the speed high except for a few isolated iterations. 
The argument here is that running the whole test with a larger FFT 
run length is computationally expensive; it's very likely to be 
unnecessary (unless the FFT run length breakpoint is grossly 
incorrect - in which case a run will contain a great many 
"recoveries"); finally, even if an excess rounding error does slip 
through & cause an incorrect result, double-checking with a different 
program will pick it up.

Prime95 is a bit different:

The extra 11 bits of mantissa in the Intel FPU registers cause the 
roundoff errors to appear much more smoothly distributed, so, even 
if we do see the odd 0.45..., we _might_ still be OK.

Prime95, working in its default mode, checks the convolution error 
only every 128 iterations, so there's _some_ chance that things do, in 
fact, slip through. The safeguard here is that repeating the run with 
a different offset (as in a double-check run) will give a different 
distribution of roundoff errors, and (if the calculation went wrong 
as a result of an excess rounding error) it will not go wrong at the 
same place again - so the final residual will be different.
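The offset argument rests on a simple identity mod 2^p - 1: squaring a residual that has been rotated by s bits gives the true square rotated by 2s bits, so the same arithmetic flows through a different bit pattern (hence different roundoff errors) while the answer stays recoverable. A sketch - `rot` is a made-up helper, and the real clients apply the shift inside the FFT data:

```python
P = 13                # toy exponent
M = (1 << P) - 1

def rot(x, s):
    # Rotate x left by s bits modulo 2^P - 1: multiply by 2^s; since
    # 2^P = 1 mod M, the shift wraps around like a bit rotation.
    return (x << (s % P)) % M

x, s = 1234 % M, 5
# Squaring the rotated residual = rotating the true square by 2*s bits:
assert (rot(x, s) ** 2) % M == rot((x * x) % M, 2 * s)
```

If an excess rounding error corrupts a shifted run, it does so at a different point and in different words than in an unshifted run, so two runs with different offsets that agree on the final residual are strong evidence that neither went wrong.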


Regards
Brian Beesley
_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
