(I don't write AT&T assembly so I am unable to make a patch. Nevertheless I hope someone who writes AT&T assembly could fix it.)
The x87 `sinl` instruction has been suffering from an accuracy problem since decades ago, which is described in this article: https://software.intel.com/blogs/2014/10/09/fsin-documentation-improvements-in-the-intel-64-and-ia-32-architectures-software Long story short: Before we can calculate `sin(x)`, we have to reduce `x` such that it falls in (-π/2,π/2]. This can be easily done via dividing `x` with π and getting the remainder. The problem is that, instead of a reasonably accurate value of π, `FSIN` uses a 66-bit approximate value as the divisor, which makes the result very inaccurate if `x` is proximate to some multiple of π, because the remainder would end up with most of its upper bits being zero and very few significant bits left. To compromise with Intel people, as the article suggests, it is essential to reduce `x` before executing the `fsin` instruction. This is done as follows: 1. Use `FLDPI` instruction to get an accurate value of π. 2. Run `FPREM1` instruction repeatly until the _C2_ bit in FPU status word is cleared. The result remainder will be in (-π/2,π/2], and the _C0_,_C3_,_C1_ bits are the least three significant bits of the quotient, from left to right. 3. Calculate the sine value using `FSIN` instruction. This never fails. 4. Acknowledging that `sin(x) = -sin(x+kπ)` when `k` is odd and `sin(x) = sin(x+kπ)` when `k` is even, because the parity bit of the quotient is the _C1_ bit in the FPU status word, if it is set, negate the result with `FCHS` instruction. We get the sine value now. The above process is the same for cosine. In the case of tangent, step 4 should be removed. The following code fragment compares `sinl` from mingw-w64 and my own implementation: volatile auto one = 1.0l; auto theta = atanl(one) * 4; // This function is from mingw-w64. std::printf("sinl (theta) = %.16Le\n", sinl (theta)); std::printf("my_sinl(theta) = %.16Le\n", my_sinl(theta)); It produces the following result: sinl (theta) = 1.6263032587282567e-019 my_sinl(theta) = -0.0000000000000000e+000 My implementation could be found here: https://github.com/lhmouse/MCF/blob/master/MCFCRT/src/stdc/math/sin.c#L12 static inline long double fpu_sin(long double x){ unsigned fsw; const long double reduced = __MCFCRT_fremainder(&fsw, x, __MCFCRT_fldpi()); long double ret = __MCFCRT_fsin_unsafe(reduced); if(fsw & 0x0200){ ret = __MCFCRT_fneg(ret); } return ret; } -------------- Best regards, lh_mouse 2016-09-07 ------------------------------------------------------------------------------ _______________________________________________ Mingw-w64-public mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
