On 4/11/2018 7:03 PM, Philip McGrath wrote:
From one following along who knows fairly little about floating-point math (the Toronto/McCarthy paper looks very informative!):

On Tue, Apr 10, 2018 at 12:13 AM, George Neuner <gneun...@comcast.net <mailto:gneun...@comcast.net>> wrote:

    As Philip McGrath mentioned already, you can specify single
    precision at least for input values ... but I think inexact (i.e.
    floating point) math in Racket all is done at maximum hardware
    precision and the result coerced back to the input precision if
    possible.  For purposes of numerical analysis, that is not the
    same as performing all operations at the input precision.


Is this confirmed to be what Racket does?

Yes and no.  I did not intend to attribute the issue to Racket - it's really a hardware issue.


Racket follows the modern convention of using the CPU's SIMD hardware to do FP math.  SIMD hardware (typically) does not widen results, so you get back the same precision you put in ... but internally there was an intermediate double-wide mantissa computed that maybe didn't fit into the output.

   You see:
      single X single -> single
      double X double -> double

   The hardware does:
      single X single -> double -> single
      double X double -> quad -> double

If the result overflows or underflows, you get INF or zero rather than a correct but wider answer.


If you are trying to check your FPU using software emulations:  i.e. bigint-rationals or bigfloats, then you need to know exactly how the computations are being done.  More on that below.



Modern x86 have both SIMD and x87 co-processor.  The SIMD unit was introduced on the Pentium III with single precision and was expanded to double precision on the Pentium IV and later.   The modern convention to use the SIMD unit exclusively leaves the x87 idle all the time.  There's a good reason for it - the SIMD unit is faster because it is more tightly integrated ... the x87 originally was a separate chip and even though it now is integrated, there still is a performance penalty for having to use the co-processor interface protocol.  And the SIMD unit produces more consistent results [see below].

The x87 and the SIMD unit do not work the same way, and they may produce different results from the same calculation.  The SIMD unit offers single or double precision.  The x87 offers a 3rd choice: extended precision - an 80 bit format that has 64 bits of mantissa [vs 53 bits in an IEEE double].  Extended precision is the default mode of the x87.

The x87 always computes a full (64-bit mantissa) extended precision result before rounding AT the selected precision and then truncating TO the selected precision [which clears any trailing bits in the register].   In the SIMD unit, precision is specified by the instruction: the registers are (virtually) subdivided into slots which are sized appropriately for the specified precision [no trailing bits], and all slots are computed in parallel TO the specified precision.

Very different.  And potentially confusing if you are expecting to get bit-identical results.


GPU or DSP calculations can be even more challenging to check with a CPU because their FPU hardware often is very different - not just not IEEE compliant, but sometimes not even close.



Aside for Dale:
If you're on x86, and you find that rationals or bigfloats are too slow, the x87 may offer a faster way to your more precise calculations. see: https://docs.racket-lang.org/reference/extflonums.html?q=math%2Fextflonum#%28tech._extflonum%29



If so, are the observable differences vs. the operations working at the input precision because of error and error propagation

Yes.  Consider what happens at the limits of the selected precision.  Switching to C for a moment, consider:

   float X = FLT_MAX;
   float Y = FLT_MAX - 4;

   float Z = (X + Y) / 2;
   float W = (float) (((double) X + Y) / 2);

Z gets compiled something like:
   float t1 = X + Y
   float t2 = t1 / 2
   Z = t2

W get compiled something like:
    double t1 = (double) X
    double t2 = (double) Y
    double t3 = t1 + t2
    double t4 = t3 / 2
    W = (float) t4

In the case of Z  the temporaries are single precision, so (X + Y) overflows to INF and that carries forward through the rest of the calculation.  In the case of W, the temporaries are double precision, so the calculation of (X + Y) succeeds and W gets the right answer.

Now you don't see these temporaries [except maybe by compiling in debug mode and specifying "consistent" floating point], but they exist in the FPU registers.   The actual answers you would get depend on the FPU hardware and how the code was compiled.

E.g., on x87, keeping the temporaries in registers, both Z and W would get the same (right) answer if the FPU were in double or extended precision mode even though the compiler thinks the temporaries are single precision values.   But if the code were compiled for "consistent" floating point - which forces temporaries to be truncated to *storage* precision as if saved to memory and reloaded -  then after t1 computed some actual double precision value, that value would be truncated to single precision regardless of FPU mode.  The result would be wrong, but it would not be  INF as it would be with the SIMD unit or if the x87 had been in single precision mode from the beginning.

Computing Z on the SIMD unit, t1 will always overflow to INF because the internal double precision intermediate can't be stored back into the single precision register.  So the SIMD unit produces more consistent results in keeping with programmer expectations. [another reason it is preferred now]


(i.e., if starting from exact values, the implementation using double-precision under the hood would sometimes produce results with less error), or something else?

Double precision calculation MAY have less error than single precision calculation ... it depends on whether temporaries over or under flow.


-Philip

Hope this helps,
George

--
You received this message because you are subscribed to the Google Groups "Racket 
Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to