On 4/11/2018 7:03 PM, Philip McGrath wrote:
From one following along who knows fairly little about floating-point
math (the Toronto/McCarthy paper looks very informative!):
On Tue, Apr 10, 2018 at 12:13 AM, George Neuner <gneun...@comcast.net
<mailto:gneun...@comcast.net>> wrote:
As Philip McGrath mentioned already, you can specify single
precision at least for input values ... but I think inexact (i.e.
floating point) math in Racket all is done at maximum hardware
precision and the result coerced back to the input precision if
possible. For purposes of numerical analysis, that is not the
same as performing all operations at the input precision.
Is this confirmed to be what Racket does?
Yes and no. I did not intend to attribute the issue to Racket - it's
really a hardware issue.
Racket follows the modern convention of using the CPU's SIMD hardware to
do FP math. SIMD hardware (typically) does not widen results, so you
get back the same precision you put in ... but internally there was an
intermediate double-wide mantissa computed that maybe didn't fit into
the output.
You see:
single X single -> single
double X double -> double
The hardware does:
single X single -> double -> single
double X double -> quad -> double
If the result overflows or underflows, you get INF or zero rather than a
correct but wider answer.
If you are trying to check your FPU using software emulations: i.e.
bigint-rationals or bigfloats, then you need to know exactly how the
computations are being done. More on that below.
Modern x86 have both SIMD and x87 co-processor. The SIMD unit was
introduced on the Pentium III with single precision and was expanded to
double precision on the Pentium IV and later. The modern convention to
use the SIMD unit exclusively leaves the x87 idle all the time. There's
a good reason for it - the SIMD unit is faster because it is more
tightly integrated ... the x87 originally was a separate chip and even
though it now is integrated, there still is a performance penalty for
having to use the co-processor interface protocol. And the SIMD unit
produces more consistent results [see below].
The x87 and the SIMD unit do not work the same way, and they may produce
different results from the same calculation. The SIMD unit offers
single or double precision. The x87 offers a 3rd choice: extended
precision - an 80 bit format that has 64 bits of mantissa [vs 53 bits in
an IEEE double]. Extended precision is the default mode of the x87.
The x87 always computes a full (64-bit mantissa) extended precision
result before rounding AT the selected precision and then truncating TO
the selected precision [which clears any trailing bits in the
register]. In the SIMD unit, precision is specified by the
instruction: the registers are (virtually) subdivided into slots which
are sized appropriately for the specified precision [no trailing bits],
and all slots are computed in parallel TO the specified precision.
Very different. And potentially confusing if you are expecting to get
bit-identical results.
GPU or DSP calculations can be even more challenging to check with a CPU
because their FPU hardware often is very different - not just not IEEE
compliant, but sometimes not even close.
Aside for Dale:
If you're on x86, and you find that rationals or bigfloats are too slow,
the x87 may offer a faster way to your more precise calculations.
see:
https://docs.racket-lang.org/reference/extflonums.html?q=math%2Fextflonum#%28tech._extflonum%29
If so, are the observable differences vs. the operations working at
the input precision because of error and error propagation
Yes. Consider what happens at the limits of the selected precision.
Switching to C for a moment, consider:
float X = FLT_MAX;
float Y = FLT_MAX - 4;
float Z = (X + Y) / 2;
float W = (float) (((double) X + Y) / 2);
Z gets compiled something like:
float t1 = X + Y
float t2 = t1 / 2
Z = t2
W get compiled something like:
double t1 = (double) X
double t2 = (double) Y
double t3 = t1 + t2
double t4 = t3 / 2
W = (float) t4
In the case of Z the temporaries are single precision, so (X + Y)
overflows to INF and that carries forward through the rest of the
calculation. In the case of W, the temporaries are double precision, so
the calculation of (X + Y) succeeds and W gets the right answer.
Now you don't see these temporaries [except maybe by compiling in debug
mode and specifying "consistent" floating point], but they exist in the
FPU registers. The actual answers you would get depend on the FPU
hardware and how the code was compiled.
E.g., on x87, keeping the temporaries in registers, both Z and W would
get the same (right) answer if the FPU were in double or extended
precision mode even though the compiler thinks the temporaries are
single precision values. But if the code were compiled for
"consistent" floating point - which forces temporaries to be truncated
to *storage* precision as if saved to memory and reloaded - then after
t1 computed some actual double precision value, that value would be
truncated to single precision regardless of FPU mode. The result would
be wrong, but it would not be INF as it would be with the SIMD unit or
if the x87 had been in single precision mode from the beginning.
Computing Z on the SIMD unit, t1 will always overflow to INF because the
internal double precision intermediate can't be stored back into the
single precision register. So the SIMD unit produces more consistent
results in keeping with programmer expectations. [another reason it is
preferred now]
(i.e., if starting from exact values, the implementation using
double-precision under the hood would sometimes produce results with
less error), or something else?
Double precision calculation MAY have less error than single precision
calculation ... it depends on whether temporaries over or under flow.
-Philip
Hope this helps,
George
--
You received this message because you are subscribed to the Google Groups "Racket
Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.