On Wed, May 31, 2023 at 3:51 PM Benjamin Root <ben.v.r...@gmail.com> wrote:
>
> I think it is the special values aspect that is most concerning. Math is just 
> littered with all sorts of identities, especially with trig functions. While 
> I know that floating point calculations are imprecise, there are certain 
> properties of these functions that should still hold, such as ranging from -1 to 1.
>
> As a reference point on an M1 Mac using conda-forge:
> ```
> >>> import numpy as np
> >>> np.__version__
> '1.24.3'
> >>> np.sin(0.0)
> 0.0
> >>> np.cos(0.0)
> 1.0
> >>> np.sin(np.pi)
> 1.2246467991473532e-16
> >>> np.cos(np.pi)
> -1.0
> >>> np.sin(2*np.pi)
> -2.4492935982947064e-16
> >>> np.cos(2*np.pi)
> 1.0
> ```
>
> Not perfect, but still right in most places.

I would say these are all correct. The true value of sin(np.pi) *is*
1.2246467991473532e-16 (to the full precision of a double). Remember
np.pi is not π, but just a rational approximation to it.

You can see this with mpmath (note the use of np.pi, which is a fixed
float with 15 digits of precision):

>>> import mpmath
>>> mpmath.mp.dps = 100
>>> np.pi
3.141592653589793
>>> mpmath.sin(np.pi)
mpf('0.0000000000000001224646799147353177226065932274997997083053901299791949488257716260869609973258103775093255275690136556')
>>> float(_)
1.2246467991473532e-16

On the other hand, the "correct" floating-point value of sin(np.pi/2)
is exactly 1.0, because the true result is so close to 1 that it rounds
to 1.0 in double precision:

>>> mpmath.sin(np.pi/2)
mpf('0.9999999999999999999999999999999981253002716726780066919054431429030069600365417183834514572043398874406')
>>> float(mpmath.sin(np.pi/2))
1.0

That's 32 9's after the decimal point.
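
For context, the nearest double below 1.0 is 2**-53 away (about
1.1e-16), so a true value that is only ~1.9e-33 below 1 necessarily
rounds to exactly 1.0. A quick check with np.nextafter, purely for
illustration:

>>> import numpy as np
>>> np.nextafter(1.0, 0.0)   # the closest double below 1.0
0.9999999999999999
>>> 1.0 - np.nextafter(1.0, 0.0)   # the gap below 1.0, i.e. 2**-53
1.1102230246251565e-16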

(Things work out nicer at the 1s than at the 0s because the derivatives
of the trig functions are zero there, so the leading error term in the
Taylor expansion is quadratic in the argument's error rather than linear.)

Aaron Meurer

>
> I'm ambivalent about reverting. I know I would love speed improvements 
> because transformation calculations in GIS are slow using numpy, but also some 
> coordinate transformations might break because of these changes.
>
> Ben Root
>
>
> On Wed, May 31, 2023 at 11:40 AM Charles R Harris <charlesr.har...@gmail.com> 
> wrote:
>>
>>
>>
>> On Wed, May 31, 2023 at 9:12 AM Robert Kern <robert.k...@gmail.com> wrote:
>>>
>>> On Wed, May 31, 2023 at 10:40 AM Ralf Gommers <ralf.gomm...@gmail.com> 
>>> wrote:
>>>>
>>>>
>>>>
>>>> On Wed, May 31, 2023 at 4:19 PM Charles R Harris 
>>>> <charlesr.har...@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Wed, May 31, 2023 at 8:05 AM Robert Kern <robert.k...@gmail.com> wrote:
>>>>>>
>>>>>> I would much, much rather have the special functions in the `np.*` 
>>>>>> namespace be more accurate than fast on all platforms. These would not 
>>>>>> have been on my list for general purpose speed optimization. How much 
>>>>>> time is actually spent inside sin/cos even in a trig-heavy numpy 
>>>>>> program? And most numpy programs aren't trig-heavy, but the precision 
>>>>>> cost would be paid and noticeable even for those programs. I would want 
>>>>>> fast-and-inaccurate functions to be strictly opt-in for those times that 
>>>>>> they really paid off. Probably by providing them in their own module or 
>>>>>> package rather than a runtime switch, because it's probably only a part 
>>>>>> of my program that needs that kind of speed and can afford that 
>>>>>> precision loss while there will be other parts that need the precision.
>>>>>>
>>>>>
>>>>> I think that would be a good policy going forward.
>>>>
>>>>
>>>> There's a little more to it than "precise and slow good", "fast == less 
>>>> accurate == bad". We've touched on this when SVML got merged (e.g., [1]) 
>>>> and with other SIMD code, e.g. in the "Floating point precision 
>>>> expectations in NumPy" thread [2]. Even libm doesn't guarantee the best 
>>>> possible result of <0.5 ULP max error, and there are also considerations 
>>>> like whether any numerical errors are normally distributed around the 
>>>> exact mathematical answer or not (see, e.g., [3]).
>>>
>>>
>>> If we had a portable, low-maintenance, high-accuracy library that we could 
>>> vendorize (and its performance cost wasn't horrid), I might even advocate 
>>> that we do that. Reliance on platform libms is more about our maintenance 
>>> burden than about a principled accuracy/performance tradeoff. My preference is 
>>> definitely firmly on the "precise and slow good" for these ufuncs because 
>>> of the role these ufuncs play in real numpy programs; performance has a 
>>> limited, situational effect while accuracy can have substantial ones across 
>>> the board.
>>>
>>>> It seems fairly clear that with this recent change, the feeling is that 
>>>> the tradeoff is bad and that too much accuracy was lost, for not enough 
>>>> real-world gain. However, we have now had several years' worth of performance 
>>>> work with few complaints about accuracy issues.
>>>
>>>
>>> Except that we get a flurry of complaints now that they actually affect 
>>> popular platforms. I'm not sure I'd read much into a lack of complaints 
>>> before that.
>>>
>>>>
>>>> So I wouldn't throw out the baby with the bath water now and say that we 
>>>> always want the best accuracy only. It seems to me like we need a better 
>>>> methodology for evaluating changes. Contributors have been pretty careful, 
>>>> but looking back at SIMD PRs, there were usually detailed benchmarks but 
>>>> not always detailed accuracy impact evaluations.
>>>
>>>
>>> I've only seen micro-benchmarks testing the runtime of individual 
>>> functions, but maybe I haven't paid close enough attention. Have there been 
>>> any benchmarks on real(ish) programs that demonstrate what utility these 
>>> provide in even optimistic scenarios? I care precisely <1ULP about the 
>>> absolute performance of `np.sin()` on its own. There are definitely 
>>> programs that would care about that; I'm not sure any of them are (or 
>>> should be) written in Python, though.
>>>
>>
>> One of my takeaways is that there are special values where more care should 
>> be taken. Given the inherent inaccuracy of floating point computation, it 
>> can be argued that there should be no such expectation, but here we are. 
>> Some inaccuracies are more visible than others.
>>
>> I think it is less intrusive to have the option to lessen precision when 
>> more speed is needed than the other way around. Our experience is that most 
>> users are unsophisticated when it comes to floating point, so we should 
>> minimise the consequences for those users.
>>
>> Chuck