On Wed, May 31, 2023 at 3:51 PM Benjamin Root <[email protected]> wrote:
>
> I think it is the special values aspect that is most concerning. Math is just
> littered with all sorts of identities, especially with trig functions. While
> I know that floating point calculations are imprecise, there are certain
> properties of these functions that still hold, such as the values staying
> within the range -1 to 1.
>
> As a reference point on an M1 Mac using conda-forge:
> ```
> >>> import numpy as np
> >>> np.__version__
> '1.24.3'
> >>> np.sin(0.0)
> 0.0
> >>> np.cos(0.0)
> 1.0
> >>> np.sin(np.pi)
> 1.2246467991473532e-16
> >>> np.cos(np.pi)
> -1.0
> >>> np.sin(2*np.pi)
> -2.4492935982947064e-16
> >>> np.cos(2*np.pi)
> 1.0
> ```
>
> Not perfect, but still right in most places.
I would say these are all correct. The true value of sin(np.pi) *is*
1.2246467991473532e-16 (to the precision shown). Remember that np.pi is not
π, but just a rational approximation to it.

You can see this with mpmath (note the use of np.pi, which is a fixed
double-precision float, good to about 16 significant decimal digits):
```
>>> import numpy as np
>>> import mpmath
>>> mpmath.mp.dps = 100
>>> np.pi
3.141592653589793
>>> mpmath.sin(np.pi)
mpf('0.0000000000000001224646799147353177226065932274997997083053901299791949488257716260869609973258103775093255275690136556')
>>> float(_)
1.2246467991473532e-16
```
On the other hand, the "correct" floating-point value of sin(np.pi/2)
is exactly 1.0, because the true value rounds to 1.0 at double precision:
```
>>> mpmath.sin(np.pi/2)
mpf('0.9999999999999999999999999999999981253002716726780066919054431429030069600365417183834514572043398874406')
>>> float(mpmath.sin(np.pi/2))
1.0
```
That's 32 9's after the decimal point.
(Things work out nicer at the 1s than at the 0s because the derivatives
of the trig functions are zero there, so the error term in the Taylor
series is quadratic rather than linear.)
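For concreteness, here is a quick check of that with mpmath (just a
sketch; `delta` names the rounding error of np.pi itself):

```
>>> import numpy as np
>>> import mpmath
>>> mpmath.mp.dps = 100
>>> delta = mpmath.pi - np.pi               # rounding error of np.pi
>>> mpmath.nstr(delta, 8)
'1.2246468e-16'
>>> # At a zero of sin, the deviation is linear in delta:
>>> mpmath.nstr(mpmath.sin(np.pi), 8)       # ~ delta
'1.2246468e-16'
>>> # At a maximum it is quadratic (np.pi/2 is off by exactly delta/2,
>>> # since dividing a double by 2 is exact):
>>> mpmath.nstr(1 - mpmath.sin(np.pi/2), 8)  # ~ (delta/2)**2 / 2
'1.8746997e-33'
```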
Aaron Meurer
>
> I'm ambivalent about reverting. I know I would love speed improvements
> because transformation calculations in GIS are slow using numpy, but also
> some coordinate transformations might break because of these changes.
>
> Ben Root
>
>
> On Wed, May 31, 2023 at 11:40 AM Charles R Harris <[email protected]>
> wrote:
>>
>>
>>
>> On Wed, May 31, 2023 at 9:12 AM Robert Kern <[email protected]> wrote:
>>>
>>> On Wed, May 31, 2023 at 10:40 AM Ralf Gommers <[email protected]>
>>> wrote:
>>>>
>>>>
>>>>
>>>> On Wed, May 31, 2023 at 4:19 PM Charles R Harris
>>>> <[email protected]> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Wed, May 31, 2023 at 8:05 AM Robert Kern <[email protected]> wrote:
>>>>>>
>>>>>> I would much, much rather have the special functions in the `np.*`
>>>>>> namespace be more accurate than fast on all platforms. These would not
>>>>>> have been on my list for general purpose speed optimization. How much
>>>>>> time is actually spent inside sin/cos even in a trig-heavy numpy
>>>>>> program? And most numpy programs aren't trig-heavy, but the precision
>>>>>> cost would be paid and noticeable even for those programs. I would want
>>>>>> fast-and-inaccurate functions to be strictly opt-in for those times that
>>>>>> they really paid off. Probably by providing them in their own module or
>>>>>> package rather than a runtime switch, because it's probably only a part
>>>>>> of my program that needs that kind of speed and can afford that
>>>>>> precision loss while there will be other parts that need the precision.
>>>>>>
>>>>>
>>>>> I think that would be a good policy going forward.
>>>>
>>>>
>>>> There's a little more to it than "precise and slow is good" and "fast ==
>>>> less accurate == bad". We've touched on this when SVML got merged (e.g., [1])
>>>> and with other SIMD code, e.g. in the "Floating point precision
>>>> expectations in NumPy" thread [2]. Even libm doesn't guarantee the best
>>>> possible result of <0.5 ULP max error, and there are also considerations
>>>> like whether any numerical errors are normally distributed around the
>>>> exact mathematical answer or not (see, e.g., [3]).
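>>>>
>>>> (For reference: one ULP, a "unit in the last place", is the gap between
>>>> adjacent doubles, and a max error below 0.5 ULP means every result is
>>>> the correctly rounded double. NumPy exposes the gap directly; a quick
>>>> illustration, not taken from the threads cited above:)
>>>>
>>>> ```
>>>> >>> import numpy as np
>>>> >>> np.spacing(1.0)    # 1 ULP at 1.0, i.e. 2**-52
>>>> 2.220446049250313e-16
>>>> >>> np.spacing(1e16)   # ULPs scale with magnitude
>>>> 2.0
>>>> ```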
>>>
>>>
>>> If we had a portable, low-maintenance, high-accuracy library that we could
>>> vendorize (and its performance cost wasn't horrid), I might even advocate
>>> that we do that. Reliance on platform libms is more about our maintenance
>>> burden than about a principled accuracy/performance tradeoff. My preference is
>>> definitely firmly on the "precise and slow good" for these ufuncs because
>>> of the role these ufuncs play in real numpy programs; performance has a
>>> limited, situational effect while accuracy can have substantial ones across
>>> the board.
>>>
>>>> It seems fairly clear that with this recent change, the feeling is that
>>>> the tradeoff is bad and that too much accuracy was lost, for not enough
>>>> real-world gain. However, we have now had several years' worth of
>>>> performance work with few complaints about accuracy issues.
>>>
>>>
>>> Except that we get a flurry of complaints now that they actually affect
>>> popular platforms. I'm not sure I'd read much into a lack of complaints
>>> before that.
>>>
>>>>
>>>> So I wouldn't throw out the baby with the bath water now and say that we
>>>> always want the best accuracy only. It seems to me like we need a better
>>>> methodology for evaluating changes. Contributors have been pretty careful,
>>>> but looking back at SIMD PRs, there were usually detailed benchmarks but
>>>> not always detailed accuracy impact evaluations.
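>>>>
>>>> For illustration, a minimal accuracy evaluation of that kind could compare
>>>> a ufunc against an arbitrary-precision reference (just a sketch, not an
>>>> existing NumPy benchmark; it reports the max ULP error and the signed bias):
>>>>
>>>> ```
>>>> import numpy as np
>>>> import mpmath
>>>>
>>>> mpmath.mp.dps = 50  # reference precision far beyond double
>>>>
>>>> x = np.random.default_rng(0).uniform(-np.pi, np.pi, 10_000)
>>>> got = np.sin(x)
>>>> # Reference values, correctly rounded to double (so this measures the
>>>> # true error only to within 0.5 ULP).
>>>> ref = np.array([float(mpmath.sin(mpmath.mpf(v))) for v in x])
>>>>
>>>> err = (got - ref) / np.spacing(np.abs(ref))  # signed error in ULPs
>>>> print("max |error| (ULP):", np.abs(err).max())
>>>> print("mean error (ULP): ", err.mean())  # bias around the true value
>>>> ```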
>>>
>>>
>>> I've only seen micro-benchmarks testing the runtime of individual
>>> functions, but maybe I haven't paid close enough attention. Have there been
>>> any benchmarks on real(ish) programs that demonstrate what utility these
>>> provide in even optimistic scenarios? I care precisely <1ULP about the
>>> absolute performance of `np.sin()` on its own. There are definitely
>>> programs that would care about that; I'm not sure any of them are (or
>>> should be) written in Python, though.
>>>
>>
>> One of my takeaways is that there are special values where more care should
>> be taken. Given the inherent inaccuracy of floating point computation, it
>> can be argued that there should be no such expectation, but here we are.
>> Some inaccuracies are more visible than others.
>>
>> I think it is less intrusive to have the option to lessen precision when
>> more speed is needed than the other way around. Our experience is that most
>> users are unsophisticated when it comes to floating point, so we should
>> minimise the consequences for those users.
>>
>> Chuck
_______________________________________________
NumPy-Discussion mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: [email protected]