Reading `mingw-w64/mingw-w64-crt/math/fmal.c`: long double fmal ( long double _x, long double _y, long double _z) { return ((_x * _y) + _z); }
This implementation is completely wrong. https://en.wikipedia.org/wiki/Multiply–accumulate_operation#Fused_multiply.E2.80.93add The multiplication in a single FMA operation must behave as if the result had infinite precision. That is, multiplying two x87-extended-precision floating point numbers (1 sign + 15 exp + 64 frac = 80 bits) yields a result of 144 bits (1 sign + 15 exp + 64 frac * 2 = 144 bits). For example, with a conforming `fmal()`, the expression `fmal(1.000000002l, 3.000000004l, -3.000000010l)` shall yield approximately `8e-18`, because `1.000000002l * 3.000000004l` yields `3.000000010000000008l`. But in mingw-w64, this indeterminate result is truncated when converted to `long double`, yielding `3.00000001l`, and adding `-3.000000010l` to it yields zero. Since x87 does not have 128-bit registers, FMA must be done in software: 1. Split both multiplier into higher and lower parts. Since a `long double` has 64 significant bits (it does not have a hidden bit), either of the two parts has to have 32 bits so we don't get precision losses when multiplying them. 2. Keeping in mind that `(a+b)(c+d)=ac+ad+bc+bd`, calculate the sum IN THE FOLLOWING ORDER: long double ret = z; ret += xhi * yhi; ret += xhi * ylo + xlo * yhi; ret += xlo * ylo; A conforming implementation can be found here: https://github.com/lhmouse/MCF/blob/master/MCFCRT/src/stdc/math/fma.c -------------- Best regards, lh_mouse 2016-09-08 ------------------------------------------------------------------------------ _______________________________________________ Mingw-w64-public mailing list Mingw-w64-public@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mingw-w64-public