Re: [Prime] reason a LL-test goes wrong (Was: Curious recent GIMPS member)

Richard Woods Thu, 26 Oct 2006 06:11:12 -0700

david eddy wrote:
> The important reason a LL-test goes wrong is due to
> using floating point FFT to perform exact integer arithmetic.


No, that's not it.

"Floating-point" arithmetic is (or should be, Intel's FDIV blunder 
notwithstanding) just as exact and deterministic as, though more complicated 
than, "integer" arithmetic.  Floating-point numbers are in fact composed of 
multiple exact integers, on which exact integer arithmetic is performed, but 
with the added complication of shifting, which may lead to losses of some 
result bits if the shifting takes them out of the range of the corresponding FP 
field.  (By "shifting" I am including truncation, rounding, scaling, and 
normalization.)  These losses are predictable and bounded, and proper 
programming can guarantee 100% accuracy of eventual integer results.  George 
_has taken such precautions_ in his FFT routines.  That's why you might 
occasionally see Prime95 (and its relatives) display a message about changing 
FFT size because of the magnitude of roundoff sums.
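
To make the "exact integers plus shifting" point concrete, here is a minimal Python sketch. It assumes IEEE 754 double precision (53-bit significand), which is what Python's float is on ordinary hardware; the helper name decompose is mine, purely for illustration.

```python
import math
from fractions import Fraction

def decompose(x: float):
    """Split a finite float into (integer mantissa, power-of-two exponent).

    Hypothetical helper for illustration; assumes a 53-bit significand.
    """
    m, e = math.frexp(x)           # x == m * 2**e with 0.5 <= |m| < 1
    mantissa = int(m * 2**53)      # scaling by a power of two is exact
    return mantissa, e - 53        # x == mantissa * 2**(e - 53)

# The float nearest to 0.1 is one specific rational number, nothing fuzzy.
mant, exp = decompose(0.1)
assert Fraction(mant) * Fraction(2) ** exp == Fraction(0.1)

# Integer-valued floats whose product stays below 2**53 multiply exactly.
a, b = 67108859, 67108837          # both below 2**26
assert float(a) * float(b) == float(a * b)

# Past 53 bits the low bit is shifted out, in a predictable way.
big = float(2**53)
assert big + 1.0 == big                   # the +1 is lost to rounding
assert big + 2.0 == float(2**53 + 2)      # the +2 still fits
```

The last two assertions are the "loss of result bits" in miniature: nothing random happens, and a program that keeps its values inside the 53-bit range never sees the loss at all.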

The actual reasons for erroneous L-L results are mistakes in hardware (or, 
sometimes, non-Prime95 software).  (Yes, it's possible that George's software 
still has bugs, but in practice, 99.99+% of errors lie elsewhere.)  Even the 
best hardware is subject to errors caused by cosmic rays crashing into the 
silicon chips.  Overheating (i.e., inadequate cooling) and overclocking can 
also cause errors in the FPU or elsewhere.
 
> As I am currently doing a double check I am interested in how the
> probability of an erroneous LL test varies as we go from the bottom
> to the top of the range of exponents for a given FFT size.

When operating on exponents near the top of an FFT range, Prime95's internal 
crosschecks will sometimes find roundoff errors high enough to warrant changing 
to the next higher FFT size, so as _not_ to produce an erroneous LL test result.
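
The kind of crosscheck I mean can be sketched in a few lines of Python. This is not Prime95's actual transform or code: it is a plain zero-padded radix-2 FFT, and the names square_via_fft and checked_square, the starting word size, and the 0.4 threshold are all assumptions chosen for the illustration. The idea is only that every convolution output should land very close to an integer, and if one does not, you retry with fewer bits per word (that is, a longer FFT).

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def square_via_fft(x, bits_per_word, n_words):
    """Square x with a floating-point FFT; return (result, max roundoff)."""
    assert x >> (bits_per_word * n_words) == 0, "x does not fit"
    mask = (1 << bits_per_word) - 1
    words = [(x >> (bits_per_word * i)) & mask for i in range(n_words)]
    padded = [complex(w) for w in words] + [0j] * n_words  # avoid wraparound
    spectrum = fft(padded)
    conv = fft([v * v for v in spectrum], invert=True)
    size = len(padded)
    result, max_err = 0, 0.0
    for i, v in enumerate(conv):
        val = v.real / size
        nearest = round(val)                    # the integer we believe in
        max_err = max(max_err, abs(val - nearest))
        result += nearest << (bits_per_word * i)
    return result, max_err

def checked_square(x, bits_per_word=20, threshold=0.4):
    """Halve the word size (lengthening the FFT) until roundoff is small."""
    while bits_per_word >= 1:
        needed = max(1, -(-x.bit_length() // bits_per_word))  # ceiling
        n_words = 1
        while n_words < needed:
            n_words *= 2
        result, err = square_via_fft(x, bits_per_word, n_words)
        if err < threshold:
            return result, err, n_words
        bits_per_word //= 2
    raise ArithmeticError("roundoff too large at every word size")

x = 2**127 - 1
result, err, n_words = checked_square(x)
assert result == x * x      # the floating-point FFT gave an exact integer
```

One caveat about the sketch: the check is only meaningful while the convolution values stay well inside the 53-bit range, which is why it starts from a conservative word size rather than the largest one that might work.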

> The probability at the top of the range is presumably such  that the cost
> in time of an erroneous LL test balances the extra time needed for the
> next FFT size, for which the chance of error is small.

Prime95 is more proactive than that, nipping potential errors in the bud before 
they get large enough to affect the correctness of a result.  But because 
Prime95 depends on the hardware to perform its operations, it cannot detect all 
possible errors induced by the hardware effects mentioned above.  _That's_ the 
reason for doublechecking.
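
A small Python sketch of why a second run catches what the first cannot. It uses exact big-integer arithmetic rather than an FFT, and the flip_at parameter is a hypothetical stand-in for a cosmic-ray or overheating fault: it flips one bit of the running value at one iteration. Comparing full residues is a simplification made for the sketch.

```python
def lucas_lehmer_residue(p, flip_at=None):
    """Final Lucas-Lehmer residue for 2**p - 1, p an odd prime (0 means prime).

    flip_at is a hypothetical fault injector: after that iteration, one
    bit of s is flipped to imitate a hardware error.
    """
    m = (1 << p) - 1
    s = 4
    for i in range(p - 2):
        s = (s * s - 2) % m
        if i == flip_at:
            s ^= 1 << 3            # simulated single-bit upset
    return s

# Clean runs: 2**7 - 1 and 2**127 - 1 are prime, 2**11 - 1 is not.
assert lucas_lehmer_residue(7) == 0
assert lucas_lehmer_residue(127) == 0
assert lucas_lehmer_residue(11) != 0

# One flipped bit in one iteration turns a prime into a false "composite".
first_run = lucas_lehmer_residue(7, flip_at=1)   # faulty machine
second_run = lucas_lehmer_residue(7)             # independent doublecheck
assert first_run != second_run                   # mismatch exposes the error
```

Nothing inside the faulty run looks wrong: every later step is computed correctly from the corrupted value. Only the disagreement between the two residues shows that one of the runs was bad.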

Richard Woods
_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime
