(I don't write AT&T assembly so I am unable to make a patch.
Nevertheless I hope someone who writes AT&T assembly could fix it.)

The x87 `sinl` instruction has been suffering from an accuracy problem
since decades ago, which is described in this article:
https://software.intel.com/blogs/2014/10/09/fsin-documentation-improvements-in-the-intel-64-and-ia-32-architectures-software

Long story short: Before we can calculate `sin(x)`, we have to reduce `x`
such that it falls in (-π/2,π/2]. This can be easily done via dividing `x`
with π and getting the remainder. The problem is that, instead of
a reasonably accurate value of π, `FSIN` uses a 66-bit approximate value
as the divisor, which makes the result very inaccurate if `x` is proximate to
some multiple of π, because the remainder would end up with most of
its upper bits being zero and very few significant bits left.

To compromise with Intel people, as the article suggests, it is essential
to reduce `x` before executing the `fsin` instruction. This is done as follows:

1. Use `FLDPI` instruction to get an accurate value of π.
2. Run `FPREM1` instruction repeatly until the _C2_ bit in FPU status word
    is cleared. The result remainder will be in (-π/2,π/2], and 
    the _C0_,_C3_,_C1_ bits are the least three significant bits of
    the quotient, from left to right.
3. Calculate the sine value using `FSIN` instruction. This never fails.
4. Acknowledging that `sin(x) = -sin(x+kπ)` when `k` is odd and
    `sin(x) = sin(x+kπ)` when `k` is even, because the parity bit of the 
quotient
    is the _C1_ bit in the FPU status word, if it is set, negate the result with
    `FCHS` instruction. We get the sine value now.

The above process is the same for cosine.
In the case of tangent, step 4 should be removed.


The following code fragment compares `sinl` from mingw-w64 and
my own implementation:

    volatile auto one = 1.0l;
    auto theta = atanl(one) * 4; // This function is from mingw-w64.
    std::printf("sinl   (theta) = %.16Le\n", sinl   (theta));
    std::printf("my_sinl(theta) = %.16Le\n", my_sinl(theta));

It produces the following result:

    sinl   (theta) = 1.6263032587282567e-019
    my_sinl(theta) = -0.0000000000000000e+000

My implementation could be found here:
https://github.com/lhmouse/MCF/blob/master/MCFCRT/src/stdc/math/sin.c#L12

    static inline long double fpu_sin(long double x){
        unsigned fsw;
        const long double reduced = __MCFCRT_fremainder(&fsw, x, 
__MCFCRT_fldpi());
        long double ret = __MCFCRT_fsin_unsafe(reduced);
        if(fsw & 0x0200){
            ret = __MCFCRT_fneg(ret);
        }
        return ret;
    }
                                
--------------
Best regards,
lh_mouse
2016-09-07


------------------------------------------------------------------------------
_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public

Reply via email to