在 2019/12/16 下午4:43, Martin Storsjö 写道: > On Fri, 13 Dec 2019, Liu Hao wrote: > >> Side note: The test may fail on x86 without excess casts, due to the >> fact that x87 extending format is used for intermediate results, which >> always cause double-rounding errors. In order for the test to pass, >> hardware single-precision floating-point arithmetic must be enabled by >> using GCC's `-msse -mfpmath=sse` options. > > Do you mean in the testcase itself, or in the implementation of the fma > function? >
The former. It is exactly why `fmaf()` uses three `+=` instead of adding
all parts together in a single expression. The `+=` operator, being an
assignment operator, forces rounding to `float` if `FLT_EVAL_METHOD` is
zero. Probably I should have added `#pragma STDC FP_CONTRACT OFF`. I
will try that later today.
> If the latter, should we build that part of mingw-w64-crt with specific
> compiler flags, or perhaps with the implementation converted to assembly
> (either inline or separate) to enforce the right form?
>
>> +static inline void break_down(iec559_double *restrict lo, iec559_double
>> *restrict hi, double x) {
>> + hi->f = x;
>> + /* Erase low-order significant bits. `hi->f` now has only 32
>> significant bits. */
>
> 32 bits? I can't get that number to add up properly, regardless if
> including the sign/exponent or omitting it...
>
Oops sorry it was copied from the `long double` variant. Perhaps all
such comments are wrong. Will fix them later.
>> +static inline void break_down(iec559_float *restrict lo, iec559_float
>> *restrict hi, float x) {
>> + hi->f = x;
>> + /* Erase low-order significant bits. `hi->f` now has only 32
>> significant bits. */
>
> Ditto
>
>
> Other than these details, the implementations look good to me.
>
I am still unsure about rounding errors in this case:
We want to calculate `FMA(X, Y, Z)`:
Let `X = X1 + X2` and let `Y = Y1 + Y2`. Then we have
FMA(X, Y, Z)
= Z + X * Y
= Z + (X1 + X2) (Y1 + Y2)
= Z + X1 Y1 + (X1 Y2 + X2 Y1) + Y1 Y2
^ ^^^^^ ^^^^^^^^^^^^^^^ ^^^^^
(1) (2) (3)
I am afraid that `Z + (1) + (2)` with infinite precision may produce a
roundoff larger than 1/2 ULP which should be rounded up, but using two
non-infinite addition could drop that bit. This is why I updated the
patch so now `X1 Y1` has fewer significant bits.
Assuming `double`:
|---- Z -----| ; 53 bits
|-- X1 Y1 --| ; 52 bits
|-- X1 Y2 ---| ; 53 bits
|--- X2 Y1 --| ; 53 bits
|--- X2 Y2 ---| ; 54 bits <inexact>
^MSB ^ULP
I coundn't proof that there will be no rounding errors in this case. The
only thing I know for sure at the moment is that if no number is
denormal then the MSB of both `X1` and `Y1` are ones.
Comments are welcome.
--
Best regards,
LH_Mouse
signature.asc
Description: OpenPGP digital signature
_______________________________________________ Mingw-w64-public mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
