Reading `mingw-w64/mingw-w64-crt/math/fmal.c`:

    long double
    fmal ( long double _x,  long double _y,  long double _z)
    {
      return ((_x * _y) + _z);
    }

This implementation is completely wrong.

https://en.wikipedia.org/wiki/Multiply–accumulate_operation#Fused_multiply.E2.80.93add
The multiplication in a single FMA operation must behave as if
the result had infinite precision. That is, multiplying
two x87-extended-precision floating point numbers
(1 sign + 15 exp + 64 frac = 80 bits) yields a result of 144 bits
(1 sign + 15 exp + 64 frac * 2 = 144 bits).

For example, with a conforming `fmal()`, the expression
`fmal(1.000000002l, 3.000000004l, -3.000000010l)` shall yield
approximately `8e-18`, because `1.000000002l * 3.000000004l` yields
`3.000000010000000008l`. But in mingw-w64, this indeterminate result
is truncated when converted to `long double`, yielding `3.00000001l`,
and adding `-3.000000010l` to it yields zero.

Since x87 does not have 128-bit registers, FMA must be done in software:
1. Split both multiplier into higher and lower parts. Since a `long double`
    has 64 significant bits (it does not have a hidden bit), either of the two 
parts
    has to have 32 bits so we don't get precision losses when multiplying them.
2. Keeping in mind that `(a+b)(c+d)=ac+ad+bc+bd`, calculate the sum
    IN THE FOLLOWING ORDER:

    long double ret = z;
    ret += xhi * yhi;
    ret += xhi * ylo + xlo * yhi;
    ret += xlo * ylo;


A conforming implementation can be found here:
https://github.com/lhmouse/MCF/blob/master/MCFCRT/src/stdc/math/fma.c
                                
--------------
Best regards,
lh_mouse
2016-09-08


------------------------------------------------------------------------------
_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public

Reply via email to