On Wed, May 31, 2023 at 3:51 PM Benjamin Root <ben.v.r...@gmail.com> wrote:
>
> I think it is the special values aspect that is most concerning. Math is
> just littered with all sorts of identities, especially with trig
> functions. While I know that floating point calculations are imprecise,
> there are certain properties of these functions that still hold, such as
> going from -1 to 1.
>
> As a reference point on an M1 Mac using conda-forge:
> ```
> >>> import numpy as np
> >>> np.__version__
> '1.24.3'
> >>> np.sin(0.0)
> 0.0
> >>> np.cos(0.0)
> 1.0
> >>> np.sin(np.pi)
> 1.2246467991473532e-16
> >>> np.cos(np.pi)
> -1.0
> >>> np.sin(2*np.pi)
> -2.4492935982947064e-16
> >>> np.cos(2*np.pi)
> 1.0
> ```
>
> Not perfect, but still right in most places.
I would say these are all correct. The true value of sin(np.pi) *is*
1.2246467991473532e-16 (correctly rounded to double precision). Remember
that np.pi is not π, but just a rational approximation to it. You can see
this with mpmath (note the use of np.pi, which is a fixed double-precision
float):

```
>>> import mpmath
>>> mpmath.mp.dps = 100
>>> np.pi
3.141592653589793
>>> mpmath.sin(np.pi)
mpf('0.0000000000000001224646799147353177226065932274997997083053901299791949488257716260869609973258103775093255275690136556')
>>> float(_)
1.2246467991473532e-16
```

On the other hand, the "correct" floating-point value of sin(np.pi/2) is
exactly 1.0, because the true value rounds to 1.0 in double precision:

```
>>> mpmath.sin(np.pi/2)
mpf('0.9999999999999999999999999999999981253002716726780066919054431429030069600365417183834514572043398874406')
>>> float(mpmath.sin(np.pi/2))
1.0
```

That's 32 9's after the decimal point. (Things work out nicer at the 1s
than at the 0s because the derivatives of the trig functions are zero
there, meaning the Taylor series have a cubic error term rather than a
squared one.)

Aaron Meurer

> I'm ambivalent about reverting. I know I would love speed improvements
> because transformation calculations in GIS are slow using numpy, but also
> some coordinate transformations might break because of these changes.
>
> Ben Root
>
> On Wed, May 31, 2023 at 11:40 AM Charles R Harris
> <charlesr.har...@gmail.com> wrote:
>>
>> On Wed, May 31, 2023 at 9:12 AM Robert Kern <robert.k...@gmail.com> wrote:
>>>
>>> On Wed, May 31, 2023 at 10:40 AM Ralf Gommers <ralf.gomm...@gmail.com>
>>> wrote:
>>>>
>>>> On Wed, May 31, 2023 at 4:19 PM Charles R Harris
>>>> <charlesr.har...@gmail.com> wrote:
>>>>>
>>>>> On Wed, May 31, 2023 at 8:05 AM Robert Kern <robert.k...@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> I would much, much rather have the special functions in the `np.*`
>>>>>> namespace be more accurate than fast on all platforms.
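[As a standard-library-only cross-check of the point above: sin(np.pi) = sin(π − np.pi) ≈ π − np.pi for such a tiny argument, so the "wrong-looking" result is precisely the rounding error of np.pi itself. This is just an illustrative sketch; the 50-digit π constant below is typed in by hand.]

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50

# pi to 50 digits (a well-known constant, entered by hand here)
PI = Decimal("3.14159265358979323846264338327950288419716939937511")

# Decimal(math.pi) converts the float's exact binary value, so `gap` is
# the exact error of the double-precision approximation to pi.
gap = PI - Decimal(math.pi)

# sin(math.pi) = sin(pi - gap) = sin(gap) ~ gap for such a tiny gap.
print(float(gap))         # 1.2246467991473532e-16
print(math.sin(math.pi))  # the same value, up to the last ulp of the libm
```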
>>>>>> These would not have been on my list for general purpose speed
>>>>>> optimization. How much time is actually spent inside sin/cos even in
>>>>>> a trig-heavy numpy program? And most numpy programs aren't
>>>>>> trig-heavy, but the precision cost would be paid and noticeable even
>>>>>> for those programs. I would want fast-and-inaccurate functions to be
>>>>>> strictly opt-in for those times that they really pay off. Probably by
>>>>>> providing them in their own module or package rather than a runtime
>>>>>> switch, because it's probably only a part of my program that needs
>>>>>> that kind of speed and can afford that precision loss, while there
>>>>>> will be other parts that need the precision.
>>>>>
>>>>> I think that would be a good policy going forward.
>>>>
>>>> There's a little more to it than "precise and slow good", "fast ==
>>>> less accurate == bad". We've touched on this when SVML got merged
>>>> (e.g., [1]) and with other SIMD code, e.g. in the "Floating point
>>>> precision expectations in NumPy" thread [2]. Even libm doesn't
>>>> guarantee the best possible result of <0.5 ULP max error, and there
>>>> are also considerations like whether any numerical errors are normally
>>>> distributed around the exact mathematical answer or not (see, e.g.,
>>>> [3]).
>>>
>>> If we had a portable, low-maintenance, high-accuracy library that we
>>> could vendorize (and its performance cost wasn't horrid), I might even
>>> advocate that we do that. Reliance on platform libms is more about our
>>> maintenance burden than a principled accuracy/performance tradeoff. My
>>> preference is definitely firmly on the "precise and slow good" side for
>>> these ufuncs because of the role these ufuncs play in real numpy
>>> programs; performance has a limited, situational effect, while accuracy
>>> can have substantial ones across the board.
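[An aside on the "<0.5 ULP" criterion mentioned above: it can be made concrete with math.ulp from the standard library (Python 3.9+). The helper below is just a sketch of a ULP-distance measure, not anything NumPy ships.]

```python
import math

def ulp_distance(a: float, b: float) -> float:
    """Distance between two finite floats in units-in-the-last-place (ULP)."""
    return abs(a - b) / math.ulp(max(abs(a), abs(b)))

# A correctly rounded libm result is within 0.5 ULP of the true value;
# vectorized/SIMD implementations often promise only a few ULP instead.
print(ulp_distance(1.0, 1.0 + 2**-52))  # 1.0: these neighbours are one ULP apart
```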
>>>
>>>> It seems fairly clear that with this recent change, the feeling is
>>>> that the tradeoff is bad and that too much accuracy was lost for not
>>>> enough real-world gain. However, we have now had several years' worth
>>>> of performance work with few complaints about accuracy issues.
>>>
>>> Except that we get a flurry of complaints now that they actually affect
>>> popular platforms. I'm not sure I'd read much into a lack of complaints
>>> before that.
>>>
>>>> So I wouldn't throw out the baby with the bath water now and say that
>>>> we always want the best accuracy only. It seems to me like we need a
>>>> better methodology for evaluating changes. Contributors have been
>>>> pretty careful, but looking back at SIMD PRs, there were usually
>>>> detailed benchmarks but not always detailed accuracy impact
>>>> evaluations.
>>>
>>> I've only seen micro-benchmarks testing the runtime of individual
>>> functions, but maybe I haven't paid close enough attention. Have there
>>> been any benchmarks on real(ish) programs that demonstrate what utility
>>> these provide in even optimistic scenarios? I care precisely <1ULP
>>> about the absolute performance of `np.sin()` on its own. There are
>>> definitely programs that would care about that; I'm not sure any of
>>> them are (or should be) written in Python, though.
>>
>> One of my takeaways is that there are special values where more care
>> should be taken. Given the inherent inaccuracy of floating point
>> computation, it can be argued that there should be no such expectation,
>> but here we are. Some inaccuracies are more visible than others.
>>
>> I think it is less intrusive to have the option to lessen precision when
>> more speed is needed than the other way around. Our experience is that
>> most users are unsophisticated when it comes to floating point, so we
>> should minimise the consequences for those users.
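[The "strictly opt-in" pattern discussed above might look something like the sketch below. `fast_sin` is purely hypothetical, not a proposed NumPy API: here it is a 7th-order Taylor polynomial that is only reasonable for small |x|, standing in for any low-precision/high-speed variant.]

```python
import math

def fast_sin(x: float) -> float:
    """Hypothetical low-precision, high-speed sine: the 7th-order Taylor
    polynomial x - x^3/6 + x^5/120 - x^7/5040 in nested (Horner-like)
    form. Illustrative only; usable only for small |x|."""
    x2 = x * x
    return x * (1.0 - x2 / 6.0 * (1.0 - x2 / 20.0 * (1.0 - x2 / 42.0)))

# Opt-in at the call site: only the hot loop that can tolerate the
# precision loss uses the fast variant; everything else keeps math.sin.
err = abs(fast_sin(0.5) - math.sin(0.5))
print(err)  # on the order of 1e-9 at x = 0.5
```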
>>
>> Chuck
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com