Re: sin() implementation
On Fri, Sep 29, 2017 at 11:16:24AM +0200, Alexandre Ratchov wrote: > On Wed, Sep 27, 2017 at 03:09:48PM +0200, Joerg Sonnenberger wrote: > > On Wed, Sep 27, 2017 at 08:40:26AM +0200, Alexandre Ratchov wrote: > > > Even on modern amd64s integer arithmetics and bitwise operations are > > > faster (and more precise in many cases) than floating point > > > equivalents. > > > > Can you actually substanciate this claim? > > [OT] > > I'm working on typical signal processing code doing mostly > multiplications, additions and table lookups. For instance, when we do > the following with floats: > > float a, b, x, y; > y = a * x + b; > > we get 24 bits of precision as floats have 24-bit mantissa. If, with > proper typedefs and #ifdefery, we replace floats by ints representing > the same numbers as 1:4:27 fixed-points: > > int a, b, x, y; > y = ((long long)a * x >> 27) + b; > > we get 28 bits of precision, i.e. 4 extra bits of precision compared > to floats. > > I just did a quick test on a i5, gcc 4.2. The fixed-point version > consumes 10% to 50% less CPU depending on the algorithm. To test, I > just run the algorithm during 10 seconds and measure the throughput. You are comparing Apples and Oranges. Just for starters, it is quite common in newer (as in: less than 20 years old) calling conventions to pass the function arguments via registers. It is also quite common that moving data from a floating point register to a general purpose register requires going via the stack. On super scalar architectures this means that the basic input checks can often be done in parallel to the start of the range reduction etc. I'm not saying that libm should always use FP math. All I've been saying is that the code was written under certain assumptions and many of them haven't been valid for a long time. Joerg
Re: sin() implementation
On Fri, Sep 29, 2017 at 11:28:42AM +0200, Otto Moerbeek wrote: > On Fri, Sep 29, 2017 at 11:16:24AM +0200, Alexandre Ratchov wrote: > > > On Wed, Sep 27, 2017 at 03:09:48PM +0200, Joerg Sonnenberger wrote: > > > On Wed, Sep 27, 2017 at 08:40:26AM +0200, Alexandre Ratchov wrote: > > > > Even on modern amd64s integer arithmetics and bitwise operations are > > > > faster (and more precise in many cases) than floating point > > > > equivalents. > > > > > > Can you actually substanciate this claim? > > > > [OT] > > > > I'm working on typical signal processing code doing mostly > > multiplications, additions and table lookups. For instance, when we do > > the following with floats: > > > > float a, b, x, y; > > y = a * x + b; > > > > we get 24 bits of precision as floats have 24-bit mantissa. If, with > > proper typedefs and #ifdefery, we replace floats by ints representing > > the same numbers as 1:4:27 fixed-points: > > > > int a, b, x, y; > > y = ((long long)a * x >> 27) + b; > > > > we get 28 bits of precision, i.e. 4 extra bits of precision compared > > to floats. > > > > I just did a quick test on a i5, gcc 4.2. The fixed-point version > > consumes 10% to 50% less CPU depending on the algorithm. To test, I > > just run the algorithm during 10 seconds and measure the throughput. > > > > In the past, I've tested on other CPUs, with other compilers and the > > floating point version of this code was never faster than the > > fixed-point. > > Add to that this is platform indepedent code and that we run or at > least ran on platforms having softfloat, where the speed difference is > significant. > This was my initial motivation, zaurus had no fpu.
Re: sin() implementation
On Fri, Sep 29, 2017 at 11:16:24AM +0200, Alexandre Ratchov wrote: > On Wed, Sep 27, 2017 at 03:09:48PM +0200, Joerg Sonnenberger wrote: > > On Wed, Sep 27, 2017 at 08:40:26AM +0200, Alexandre Ratchov wrote: > > > Even on modern amd64s integer arithmetics and bitwise operations are > > > faster (and more precise in many cases) than floating point > > > equivalents. > > > > Can you actually substanciate this claim? > > [OT] > > I'm working on typical signal processing code doing mostly > multiplications, additions and table lookups. For instance, when we do > the following with floats: > > float a, b, x, y; > y = a * x + b; > > we get 24 bits of precision as floats have 24-bit mantissa. If, with > proper typedefs and #ifdefery, we replace floats by ints representing > the same numbers as 1:4:27 fixed-points: > > int a, b, x, y; > y = ((long long)a * x >> 27) + b; > > we get 28 bits of precision, i.e. 4 extra bits of precision compared > to floats. > > I just did a quick test on a i5, gcc 4.2. The fixed-point version > consumes 10% to 50% less CPU depending on the algorithm. To test, I > just run the algorithm during 10 seconds and measure the throughput. > > In the past, I've tested on other CPUs, with other compilers and the > floating point version of this code was never faster than the > fixed-point. Add to that this is platform indepedent code and that we run or at least ran on platforms having softfloat, where the speed difference is significant. -Otto
Re: sin() implementation
On Wed, Sep 27, 2017 at 03:09:48PM +0200, Joerg Sonnenberger wrote: > On Wed, Sep 27, 2017 at 08:40:26AM +0200, Alexandre Ratchov wrote: > > Even on modern amd64s integer arithmetics and bitwise operations are > > faster (and more precise in many cases) than floating point > > equivalents. > > Can you actually substanciate this claim? [OT] I'm working on typical signal processing code doing mostly multiplications, additions and table lookups. For instance, when we do the following with floats: float a, b, x, y; y = a * x + b; we get 24 bits of precision as floats have 24-bit mantissa. If, with proper typedefs and #ifdefery, we replace floats by ints representing the same numbers as 1:4:27 fixed-points: int a, b, x, y; y = ((long long)a * x >> 27) + b; we get 28 bits of precision, i.e. 4 extra bits of precision compared to floats. I just did a quick test on a i5, gcc 4.2. The fixed-point version consumes 10% to 50% less CPU depending on the algorithm. To test, I just run the algorithm during 10 seconds and measure the throughput. In the past, I've tested on other CPUs, with other compilers and the floating point version of this code was never faster than the fixed-point.
Re: sin() implementation
> On Wed, Sep 27, 2017 at 08:40:26AM +0200, Alexandre Ratchov wrote: > > Even on modern amd64s integer arithmetics and bitwise operations are > > faster (and more precise in many cases) than floating point > > equivalents. > > Can you actually substanciate this claim? The basic x87 instructions > (FLD, FST, FCOM etc) all run in one cycle with a latency of 2 according > to Agner. That's generally not worth the cost of moving from x87 to GP > registers. Precision might have been a problem two decades ago when many > compilers were too bad at correct parsing of float literals, but that > has been fixed in the mean time even without using C99 hex float > literals. The short answer is that the original msun code has been > optimized for a hardware environment that is substantially different > from modern CPUs. Many of the small details are no longer optimal. The integer arithmetics and bitwise operations are not slower than the float ones. Hence, since this is machine independent code, the approach taken is sane.
Re: sin() implementation
On Wed, Sep 27, 2017 at 08:40:26AM +0200, Alexandre Ratchov wrote: > Even on modern amd64s integer arithmetics and bitwise operations are > faster (and more precise in many cases) than floating point > equivalents. Can you actually substanciate this claim? The basic x87 instructions (FLD, FST, FCOM etc) all run in one cycle with a latency of 2 according to Agner. That's generally not worth the cost of moving from x87 to GP registers. Precision might have been a problem two decades ago when many compilers were too bad at correct parsing of float literals, but that has been fixed in the mean time even without using C99 hex float literals. The short answer is that the original msun code has been optimized for a hardware environment that is substantially different from modern CPUs. Many of the small details are no longer optimal. Joerg
Re: sin() implementation
On Tue, Sep 26, 2017 at 01:27:52PM +0200, Jan Stary wrote: > I picked sin() as an example while trying to walk though the implementation > of (pieces of) libm. If someone has the time for it, I have some questions. > > I understand the implementation originaly stems from Sun's libm of 1993. > (As does that of FreeBSD and NetBSD.) It has been tweaked over the years, > but the code that actually computes the values is mostly untouched. > > s_sin.c normalizes the argument to [-pi/4, +pi/4]. > This is how |x| <= pi/4 is tested: > > GET_HIGH_WORD(ix,x); > ix &= 0x7fff; > if(ix <= 0x3fe921fb) return __kernel_sin(x,z,0); > > Why is it done like that? Is it faster or more portable > or in any way better that comparing the value itself to M_PI_4? > (Or was it, in 1993?) Even on modern amd64s integer arithmetics and bitwise operations are faster (and more precise in many cases) than floating point equivalents. Recently I did few measurements on "real life" signal processing code (intel i5-2500K). I don't know if this is true for above condition and even if it was, I'm not sure the optimization would be visible in the execution time of programs using sinf() a lot. Hex constants would be "useful" to avoid rounding errors in decimal->binary conversions done during compilation. This could be avoided by using hex float literals, but these didn't exist in 1993. > For Inf and NaN arguments, NaN is returned as follows: > > /* sin(Inf or NaN) is NaN */ > else if (ix>=0x7ff0) return x-x; interestingly, 0x7f81 is also NaN but doesn't pass the test. AFAICS, to implement what the comment says, the test would be: else if (ix >= 0x7f80) return x - x; > > Again, why the integer conversion? Would > > else if (isinf(x) || isnan(x)) > > be slower? Less portable? > > Also, is "return x-x" just a fast/clever way to return NaN > for x being Inf _or_ NaN, preferable to return nan("")? > Above function could also just set x to 0x7fff. For sure nan(3) and isnan(3) were not common in 1993, so this might be for portability. Furthermore, we used to support non-IEEE-754 machines which may need this, I don't known.
Re: sin() implementation
On Tue, Sep 26, 2017 at 6:27 AM, Jan Starywrote: > s_sin.c normalizes the argument to [-pi/4, +pi/4]. > This is how |x| <= pi/4 is tested: > GET_HIGH_WORD(ix,x); > ix &= 0x7fff; > if(ix <= 0x3fe921fb) return __kernel_sin(x,z,0); > > Why is it done like that? Is it faster or more portable > or in any way better that comparing the value itself to M_PI_4? > See s_cos.c s_tan.c; and many others start with a similar block. Probably the integer comparison is efficient (fewer average CPU and bus resources per second tied up per operation). It will depend upon your computer's characteristics, and applications running, and how many times per second sin() is invoked on average. Assuming the compiler optimization of the above code block is as good as it could be: that should hinge on whether a single integer copy, and 32-bit integer subtraction and Bitwise AND on your CPU will burn more processor work cycles than two floating-point copy, two floating-point subtraction plus one sign negate operation against IEEE 64-bit double. The basic rule of thumb says that if a performance-significant task can be completed using Integers instead, then maximum effort should be made to avoid/minimize unnecessary usage of floating-point operations (without an excessive increase in the # of operations), As 3 to 4 times the processor work cycles for execution path to 3-argument 64-bit floating point calculation Versus the corresponding 3-argument operation using 32-bit integers. More floating point operations/sec = lower system efficiency, or less-effective use of the overall available compute capacity Versus the use of processors' regular, typically more-optimized ALU pipelines. > if (-M_PI_4 < x && x < M_PI_4) > >For Inf and NaN arguments, NaN is returned as follows: > >/* sin(Inf or NaN) is NaN */ > >else if (ix>=0x7ff0) return x-x; > > > Again, why the integer conversion? Would > > > >else if (isinf(x) || isnan(x)) > Also, is "return x-x" just a fast/clever way to return NaN > Well, the original code is applying a >= test,whereas isinf() and isnan() are shown to be equality tests that check for two specific values of the exponent field. The || branching is involving more computations though General application code should use the new isinf(), isnan(), and nan("") macros, but they have not always been part of the C standard, and I imagine this code predates it. The NaN/InF values are runtime-specific in the current C standard, so it is only sort of legitimate to do that, Because this code is part of the libc runtime's implementation itself. In this context, the libc implementation has enough information to safely rely on the assumption to make the decision without resorting to a macro call that invokes a number of other operations, and so Is free and perfectly fine to use that ix >= 0x7ff0 to equate to a NaN condition. That appears to potentially streamline the number of computations required for this case, so there might actually be a performance benefit to not use the macros here as well. -- -JH
Re: sin() implementation
On Tue, Sep 26, 2017 at 01:27:52PM +0200, Jan Stary wrote: > I picked sin() as an example while trying to walk though the implementation > of (pieces of) libm. If someone has the time for it, I have some questions. > > I understand the implementation originaly stems from Sun's libm of 1993. > (As does that of FreeBSD and NetBSD.) It has been tweaked over the years, > but the code that actually computes the values is mostly untouched. > > s_sin.c normalizes the argument to [-pi/4, +pi/4]. > This is how |x| <= pi/4 is tested: > > GET_HIGH_WORD(ix,x); > ix &= 0x7fff; > if(ix <= 0x3fe921fb) return __kernel_sin(x,z,0); > > Why is it done like that? Is it faster or more portable > or in any way better that comparing the value itself to M_PI_4? > (Or was it, in 1993?) The code is written in a way to use as little as fp ops as possible. > > For Inf and NaN arguments, NaN is returned as follows: > > /* sin(Inf or NaN) is NaN */ > else if (ix>=0x7ff0) return x-x; > > Again, why the integer conversion? Would > > else if (isinf(x) || isnan(x)) > > be slower? Less portable? These functions were not generally available when the code was written. iirc these were only standarized in C99. > > Also, is "return x-x" just a fast/clever way to return NaN > for x being Inf _or_ NaN, preferable to return nan("")? Same as above. -Otto > > I have other questions in k_sin.c and elsewhere, but I'll stop here. > Please see my naive diff below. I tried this on amd64, macppc and armv7; > regress/lib/libm/fpaccuracy results in zero difference. The only > other thing I tried is "sox -n out.raw synth 10 sin" which also gives > the same result. (That's hardly "testing" of course.) > > Surely I am missing something elementary. > I'll be glad for any insight. > > Thank you for your time > > Jan > > > Index: s_sin.c > === > RCS file: /cvs/src/lib/libm/src/s_sin.c,v > retrieving revision 1.11 > diff -u -p -r1.11 s_sin.c > --- s_sin.c 12 Sep 2016 19:47:02 - 1.11 > +++ s_sin.c 26 Sep 2017 08:54:03 - > @@ -50,17 +50,12 @@ double > sin(double x) > { > double y[2],z=0.0; > - int32_t n, ix; > + int32_t n; > > -/* High word of x. */ > - GET_HIGH_WORD(ix,x); > - > -/* |x| ~< pi/4 */ > - ix &= 0x7fff; > - if(ix <= 0x3fe921fb) return __kernel_sin(x,z,0); > - > -/* sin(Inf or NaN) is NaN */ > - else if (ix>=0x7ff0) return x-x; > + if (-M_PI_4 < x && x < M_PI_4) > + return __kernel_sin(x,z,0); > + else if (isinf(x) || isnan(x)) > + return nan(""); > > /* argument reduction needed */ > else {
Re: sin() implementation
On Tue, Sep 26, 2017 at 06:21:16PM +0200, Jan Stary wrote: > These (diff below) seem to be obvious typos in k_sin.c, > but only in comments. The comment that says > > if x < 2^-27 (hx<0x3e40), return x with inexact if x!=0 > > also puzzles me a bit: what the code does is > > GET_HIGH_WORD(ix,x); > ix &= 0x7fff; /* high word of x */ > if(ix<0x3e40) /* |x| < 2**-27 */ > {if((int)x==0) return x;} /* generate inexact */ > > If the high word of a double x is less than 0x3e40, > how could (int)x be anything else than zero? The result might be zero, but you want the inexact flag to be set in those cases. I think case 2 describes it correctly: only return 0 with the inexact flag not set if x equals zero. -Otto > > The "if x!= 0" also puzzles me: the sine of zero _is_ zero, exactly. > What would change with this: > > if (ix < 0x3e40) > return x; > > What is the role of the iy indicator? > Is it easier/faster to test the integer indicator for iy == 0 > than it would be to test y == 0 itself? Or is there another reason? > > Jan > > > Index: k_sin.c > === > RCS file: /cvs/src/lib/libm/src/k_sin.c,v > retrieving revision 1.3 > diff -u -p -r1.3 k_sin.c > --- k_sin.c 27 Oct 2009 23:59:30 - 1.3 > +++ k_sin.c 26 Sep 2017 16:07:45 - > @@ -10,15 +10,15 @@ > * > */ > > -/* __kernel_sin( x, y, iy) > +/* __kernel_sin(x, y, iy) > * kernel sin function on [-pi/4, pi/4], pi/4 ~ 0.7854 > * Input x is assumed to be bounded by ~pi/4 in magnitude. > * Input y is the tail of x. > - * Input iy indicates whether y is 0. (if iy=0, y assume to be 0). > + * Input iy indicates whether y is 0. (if iy=0, assume y to be 0). > * > * Algorithm > * 1. Since sin(-x) = -sin(x), we need only to consider positive x. > - * 2. if x < 2^-27 (hx<0x3e40 0), return x with inexact if x!=0. > + * 2. if x < 2^-27 (hx<0x3e40), return x with inexact if x!=0. > * 3. sin(x) is approximated by a polynomial of degree 13 on > * [0,pi/4] > *313
Re: sin() implementation
On Sep 26 17:41:17, h...@stare.cz wrote: > > > s_sin.c normalizes the argument to [-pi/4, +pi/4]. > > > This is how |x| <= pi/4 is tested: > > > > > > GET_HIGH_WORD(ix,x); > > > ix &= 0x7fff; > > > if(ix <= 0x3fe921fb) return __kernel_sin(x,z,0); > > > > > > Why is it done like that? Is it faster or more portable > > > or in any way better that comparing the value itself to M_PI_4? > > > (Or was it, in 1993?) > > > > Hm, is 0x3fe921fb the nearest float <= pi/4? > > s_sinf.c uses 0x3f490fd8 for that test. > > Yes. > > #include > #include > #include "math_private.h" > > int > main() > { > double d = M_PI_4; > double f = M_PI_4; float f, sorry. > int32_t i, j; > > GET_HIGH_WORD(i, d); > GET_FLOAT_WORD(j, f); > > printf("double %f, high word %#x\n", d, i); > printf("float %f, float word %#x\n", f, j); > > return 0; > } > > This says > > double 0.785398, high word 0x3fe921fb > float 0.785398, float word 0x3f490fdb > > In case of double, it's exactly 0x3fe921fb. > But why then does s_sinf.c use 0x3f490fd8 instead of 0x3f490fdb? > > Jan >
Re: sin() implementation
These (diff below) seem to be obvious typos in k_sin.c, but only in comments. The comment that says if x < 2^-27 (hx<0x3e40), return x with inexact if x!=0 also puzzles me a bit: what the code does is GET_HIGH_WORD(ix,x); ix &= 0x7fff; /* high word of x */ if(ix<0x3e40) /* |x| < 2**-27 */ {if((int)x==0) return x;} /* generate inexact */ If the high word of a double x is less than 0x3e40, how could (int)x be anything else than zero? The "if x!= 0" also puzzles me: the sine of zero _is_ zero, exactly. What would change with this: if (ix < 0x3e40) return x; What is the role of the iy indicator? Is it easier/faster to test the integer indicator for iy == 0 than it would be to test y == 0 itself? Or is there another reason? Jan Index: k_sin.c === RCS file: /cvs/src/lib/libm/src/k_sin.c,v retrieving revision 1.3 diff -u -p -r1.3 k_sin.c --- k_sin.c 27 Oct 2009 23:59:30 - 1.3 +++ k_sin.c 26 Sep 2017 16:07:45 - @@ -10,15 +10,15 @@ * */ -/* __kernel_sin( x, y, iy) +/* __kernel_sin(x, y, iy) * kernel sin function on [-pi/4, pi/4], pi/4 ~ 0.7854 * Input x is assumed to be bounded by ~pi/4 in magnitude. * Input y is the tail of x. - * Input iy indicates whether y is 0. (if iy=0, y assume to be 0). + * Input iy indicates whether y is 0. (if iy=0, assume y to be 0). * * Algorithm * 1. Since sin(-x) = -sin(x), we need only to consider positive x. - * 2. if x < 2^-27 (hx<0x3e40 0), return x with inexact if x!=0. + * 2. if x < 2^-27 (hx<0x3e40), return x with inexact if x!=0. * 3. sin(x) is approximated by a polynomial of degree 13 on *[0,pi/4] * 313
Re: sin() implementation
> > s_sin.c normalizes the argument to [-pi/4, +pi/4]. > > This is how |x| <= pi/4 is tested: > > > > GET_HIGH_WORD(ix,x); > > ix &= 0x7fff; > > if(ix <= 0x3fe921fb) return __kernel_sin(x,z,0); > > > > Why is it done like that? Is it faster or more portable > > or in any way better that comparing the value itself to M_PI_4? > > (Or was it, in 1993?) > > Hm, is 0x3fe921fb the nearest float <= pi/4? > s_sinf.c uses 0x3f490fd8 for that test. Yes. #include #include #include "math_private.h" int main() { double d = M_PI_4; double f = M_PI_4; int32_t i, j; GET_HIGH_WORD(i, d); GET_FLOAT_WORD(j, f); printf("double %f, high word %#x\n", d, i); printf("float %f, float word %#x\n", f, j); return 0; } This says double 0.785398, high word 0x3fe921fb float 0.785398, float word 0x3f490fdb In case of double, it's exactly 0x3fe921fb. But why then does s_sinf.c use 0x3f490fd8 instead of 0x3f490fdb? Jan
Re: sin() implementation
On Sep 26 13:27:52, h...@stare.cz wrote: > I picked sin() as an example while trying to walk though the implementation > of (pieces of) libm. If someone has the time for it, I have some questions. > > I understand the implementation originaly stems from Sun's libm of 1993. > (As does that of FreeBSD and NetBSD.) And Android https://android.googlesource.com/platform/external/fdlibm/ and Julia http://openlibm.org/, apparently. > It has been tweaked over the years, > but the code that actually computes the values is mostly untouched. > > s_sin.c normalizes the argument to [-pi/4, +pi/4]. > This is how |x| <= pi/4 is tested: > > GET_HIGH_WORD(ix,x); > ix &= 0x7fff; > if(ix <= 0x3fe921fb) return __kernel_sin(x,z,0); > > Why is it done like that? Is it faster or more portable > or in any way better that comparing the value itself to M_PI_4? > (Or was it, in 1993?) Hm, is 0x3fe921fb the nearest float <= pi/4? s_sinf.c uses 0x3f490fd8 for that test. Jan > For Inf and NaN arguments, NaN is returned as follows: > > /* sin(Inf or NaN) is NaN */ > else if (ix>=0x7ff0) return x-x; > > Again, why the integer conversion? Would > > else if (isinf(x) || isnan(x)) > > be slower? Less portable? > > Also, is "return x-x" just a fast/clever way to return NaN > for x being Inf _or_ NaN, preferable to return nan("")? > > I have other questions in k_sin.c and elsewhere, but I'll stop here. > Please see my naive diff below. I tried this on amd64, macppc and armv7; > regress/lib/libm/fpaccuracy results in zero difference. The only > other thing I tried is "sox -n out.raw synth 10 sin" which also gives > the same result. (That's hardly "testing" of course.) > > Surely I am missing something elementary. > I'll be glad for any insight. > > Thank you for your time > > Jan > > > Index: s_sin.c > === > RCS file: /cvs/src/lib/libm/src/s_sin.c,v > retrieving revision 1.11 > diff -u -p -r1.11 s_sin.c > --- s_sin.c 12 Sep 2016 19:47:02 - 1.11 > +++ s_sin.c 26 Sep 2017 08:54:03 - > @@ -50,17 +50,12 @@ double > sin(double x) > { > double y[2],z=0.0; > - int32_t n, ix; > + int32_t n; > > -/* High word of x. */ > - GET_HIGH_WORD(ix,x); > - > -/* |x| ~< pi/4 */ > - ix &= 0x7fff; > - if(ix <= 0x3fe921fb) return __kernel_sin(x,z,0); > - > -/* sin(Inf or NaN) is NaN */ > - else if (ix>=0x7ff0) return x-x; > + if (-M_PI_4 < x && x < M_PI_4) > + return __kernel_sin(x,z,0); > + else if (isinf(x) || isnan(x)) > + return nan(""); > > /* argument reduction needed */ > else { >
sin() implementation
I picked sin() as an example while trying to walk though the implementation of (pieces of) libm. If someone has the time for it, I have some questions. I understand the implementation originaly stems from Sun's libm of 1993. (As does that of FreeBSD and NetBSD.) It has been tweaked over the years, but the code that actually computes the values is mostly untouched. s_sin.c normalizes the argument to [-pi/4, +pi/4]. This is how |x| <= pi/4 is tested: GET_HIGH_WORD(ix,x); ix &= 0x7fff; if(ix <= 0x3fe921fb) return __kernel_sin(x,z,0); Why is it done like that? Is it faster or more portable or in any way better that comparing the value itself to M_PI_4? (Or was it, in 1993?) For Inf and NaN arguments, NaN is returned as follows: /* sin(Inf or NaN) is NaN */ else if (ix>=0x7ff0) return x-x; Again, why the integer conversion? Would else if (isinf(x) || isnan(x)) be slower? Less portable? Also, is "return x-x" just a fast/clever way to return NaN for x being Inf _or_ NaN, preferable to return nan("")? I have other questions in k_sin.c and elsewhere, but I'll stop here. Please see my naive diff below. I tried this on amd64, macppc and armv7; regress/lib/libm/fpaccuracy results in zero difference. The only other thing I tried is "sox -n out.raw synth 10 sin" which also gives the same result. (That's hardly "testing" of course.) Surely I am missing something elementary. I'll be glad for any insight. Thank you for your time Jan Index: s_sin.c === RCS file: /cvs/src/lib/libm/src/s_sin.c,v retrieving revision 1.11 diff -u -p -r1.11 s_sin.c --- s_sin.c 12 Sep 2016 19:47:02 - 1.11 +++ s_sin.c 26 Sep 2017 08:54:03 - @@ -50,17 +50,12 @@ double sin(double x) { double y[2],z=0.0; - int32_t n, ix; + int32_t n; -/* High word of x. */ - GET_HIGH_WORD(ix,x); - -/* |x| ~< pi/4 */ - ix &= 0x7fff; - if(ix <= 0x3fe921fb) return __kernel_sin(x,z,0); - -/* sin(Inf or NaN) is NaN */ - else if (ix>=0x7ff0) return x-x; + if (-M_PI_4 < x && x < M_PI_4) + return __kernel_sin(x,z,0); + else if (isinf(x) || isnan(x)) + return nan(""); /* argument reduction needed */ else {