Re: [Python-ideas] Fwd: NAN handling in the statistics module

2019-01-10 Thread Jonathan Fine
On Thu, Jan 10, 2019 at 5:07 PM David Mertz wrote:

>>> You might shoot yourself in the foot, but at least you know it's the same
>>> foot you shot yourself in using the previous version *wink*

> I've lost attribution chain. I think this is Steven, but it doesn't really 
> matter.

I think it was Steve. So far as I know, he's the only person on this
list who winks at other participants.

-- 
Jonathan
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Fwd: NAN handling in the statistics module

2019-01-10 Thread David Mertz
> One possible argument for making PASS the default, even if that means
> implementation-dependent behaviour with NANs, is that in the absence of a
> clear preference for FAIL or RETURN, at least PASS is backwards compatible.
>
> You might shoot yourself in the foot, but at least you know it's the same
> foot you shot yourself in using the previous version *wink*
I've lost attribution chain. I think this is Steven, but it doesn't really
matter.

This statement is untrue, or at best only accidentally true. The
behavior of sorted() against partially ordered collections is unspecified.
The author of Timsort says exactly this.

If statistics.median() keeps the same implementation (or keeps it behind a
PASS argument), it may or may not produce the same result in later Python
versions. Timsort is great, but even that has been tweaked slightly over
time.

I guess the statement is true if "same foot" means "meaningless answer" not
some specific value. But that hardly feels like a defense of the behavior.
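
To make the partial-ordering point concrete, here is a minimal sketch. The
variable names are only illustrative, and no particular output of median() is
claimed, since that is exactly the part that depends on the sort
implementation. It shows that every ordering comparison against a NAN is
False, and then feeds the same values to statistics.median() in two different
input orders:

```python
import math
import statistics

nan = float("nan")

# Every comparison involving NaN is False, so NaN cannot be placed
# consistently relative to any other float: there is no total order.
comparisons = [nan < 1.0, nan > 1.0, nan == nan, 1.0 < nan, 1.0 > nan]
print(comparisons)

# The same five values in two different input orders. median() sorts its
# input, and what sorted() does with NaN present is unspecified, so the
# two results are not guaranteed to agree with each other or across versions.
a = [1.0, 2.0, nan, 3.0, 4.0]
b = [nan, 1.0, 2.0, 3.0, 4.0]
print(statistics.median(a), statistics.median(b))
```

Whatever the two calls print on a given interpreter, nothing in the
documentation promises that a later version prints the same thing.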



Re: [Python-ideas] NAN handling in the statistics module

2019-01-10 Thread Neil Girdhar


On Monday, January 7, 2019 at 3:16:07 AM UTC-5, Steven D'Aprano wrote:
>
> (By the way, I'm not outright disagreeing with you, I'm trying to weigh 
> up the pros and cons of your position. You've given me a lot to think 
> about. More below.) 
>
> On Sun, Jan 06, 2019 at 11:31:30PM -0800, Nathaniel Smith wrote: 
> > On Sun, Jan 6, 2019 at 11:06 PM Steven D'Aprano wrote:
> > > I'm not wedded to the idea that the default ought to be the current 
> > > behaviour. If there is a strong argument for one of the others, I'm 
> > > listening. 
> > 
> > "Errors should never pass silently"? Silently returning nonsensical 
> > results is hard to defend as a default behavior IMO :-) 
>
> If you violate the assumptions of the function, just about everything 
> can in principle return nonsensical results. True, most of the time you 
> have to work hard at it: 
>
> import random, sys
>
> class MyList(list):
>     def __len__(self):
>         # sys.maxint is Python 2 only; sys.maxsize is the Python 3 name
>         return random.randint(0, sys.maxsize)
>
> but it isn't unreasonable to document the assumptions of a function, and 
> if the caller violates those assumptions, Garbage In Garbage Out 
> applies. 
>

I'm with Antoine, Nathaniel, David, and Chris: it is unreasonable to 
silently return nonsensical results even if you've documented it.  
Documenting it only makes it worse because it's like an "I told you so" 
when people finally figure out what's wrong and go to file the bug.
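
For comparison, the alternatives to silently passing NANs through could look
roughly like the sketch below. It is only a sketch under stated assumptions:
the function name median_with_nan_policy, the policy parameter, and its
lowercase string values are hypothetical stand-ins for the FAIL, RETURN and
PASS options discussed in this thread, not an existing statistics API, and
only float NANs are detected.

```python
import math
import statistics

def median_with_nan_policy(data, policy="fail"):
    """Hypothetical wrapper around statistics.median (illustration only)."""
    if policy not in ("fail", "return", "pass"):
        raise ValueError(f"unknown policy: {policy!r}")
    values = list(data)
    # Assumption: only float NaNs are checked; Decimal NaNs are ignored here.
    has_nan = any(isinstance(x, float) and math.isnan(x) for x in values)
    if has_nan and policy == "fail":
        # Fail fast instead of returning an order-dependent answer.
        raise ValueError("data contains NaN")
    if has_nan and policy == "return":
        # Propagate the NaN, as most float arithmetic does.
        return float("nan")
    # "pass": hand the data to median() unchanged, NaNs and all.
    return statistics.median(values)

print(median_with_nan_policy([1.0, 3.0, 2.0]))
print(median_with_nan_policy([1.0, float("nan")], policy="return"))
```

With "fail" as the default, a caller who never thought about NANs gets an
exception at the point of the mistake rather than a plausible-looking number.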
 

>
> E.g. bisect requires that your list is sorted in ascending order. If it 
> isn't, the results you get are nonsensical. 
>
> py> data = [8, 6, 4, 2, 0] 
> py> bisect.bisect(data, 1) 
> 0 
>
> That's not a bug in bisect, that's a bug in the caller's code, and it 
> isn't bisect's responsibility to fix it. 
>
> Although it could be documented better, that's the current situation 
> with NANs and median(). Data with NANs don't have a total ordering, and 
> total ordering is the unstated assumption behind the idea of a median or 
> middle value. So all bets are off. 
>
>   
> > > How would you answer those who say that the right behaviour is not to 
> > > propagate unwanted NANs, but to fail fast and raise an exception?
> > 
> > Both seem defensible a priori, but every other mathematical operation 
> > in Python propagates NaNs instead of raising an exception. Is there 
> > something unusual about median that would justify giving it unusual 
> > behavior? 
>
> Well, not everything... 
>
> py> NAN/0 
> Traceback (most recent call last): 
>   File "<stdin>", line 1, in <module>
> ZeroDivisionError: float division by zero 
>
>
> There may be others. But I'm not sure that "everything else does it" is 
> a strong justification. It is *a* justification, since consistency is 
> good, but consistency does not necessarily outweigh other concerns. 
>
> One possible argument for making PASS the default, even if that means 
> implementation-dependent behaviour with NANs, is that in the absence of
> a clear preference for FAIL or RETURN, at least PASS is backwards 
> compatible. 
>
> You might shoot yourself in the foot, but at least you know it's the same
> foot you shot yourself in using the previous version *wink* 
>
>
>
> -- 
> Steve 