On Thu, Dec 26, 2019 at 02:23:42PM -0800, Andrew Barnert via Python-ideas wrote:

> I don’t think that’s true. Surely the median of (-inf, 1, 2, 3, inf, 
> inf, inf) is well defined and can only be 3?

It's well-defined, but probably not good statistics. I'm not sure what 
measurement you are making that gives you infinities; they would surely 
be outliers and suspect data :-)

But for what it's worth, median() is fine in the presence of infinities:

    py> statistics.median([INF, 3, INF, -INF, 1, INF, 2])
    3

I can't take credit for this. Thanks to IEEE-754 semantics, median() 
ought to do the right thing for any number or combination of positive or 
negative infinities and finite numbers. Including the case Andrew 
mentioned where your data consists of all infinities, exactly half of 
which are positive and half negative:

    py> statistics.median([INF, -INF])
    nan

which is correct IEEE-754 semantics for ∞ - ∞.
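
For anyone who wants to reproduce the two results above, here is a 
minimal sketch. It assumes INF is simply float("inf"), which the 
transcripts above don't show being defined, and that the even-length 
case averages the two middle values, which is what produces ∞ - ∞.

```python
import math
import statistics

# Assumption: INF in the transcripts above is the float infinity.
INF = float("inf")

# Odd number of data points: the middle value of the sorted data
# [-inf, 1, 2, 3, inf, inf, inf] is returned unchanged.
odd_median = statistics.median([INF, 3, INF, -INF, 1, INF, 2])

# Even number of data points: the two middle values are averaged,
# and (-inf + inf) / 2 is nan under IEEE-754 rules.
even_median = statistics.median([INF, -INF])

print(odd_median)
print(math.isnan(even_median))
```

Note that nan has to be tested with math.isnan(), since a NAN never 
compares equal to anything, including itself.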

> > The number could have been 1e1000 - 1e999 (and thus should be big) 
> > or 1e999 - 1e1000 (and thus should be very negative) or 1e1000 - 
> > 1e1000 (and thus should be zero), which is why we get a NaN here.
> 
> Well, here both numbers are clearly 1e1000, and the right answer is 0. 

median() does the right thing with Decimals:

    py> statistics.median([Decimal("-1e1000"), Decimal("1e1000")])
    Decimal('0E+1000')


> The problem is that (in systems where float is IEEE double) that 
> number can’t be represented as a float in the first place, so Python 
> approximates it with inf, so you (inaccurately, but predictably and 
> understandably) get nan instead of 0. It’s like a very extreme case of 
> “float rounding error”.

More like float overflow. 1e1000 overflows to infinity, and the rest 
follows from that.
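
A rough sketch of that chain of events, assuming a platform where 
float is an IEEE-754 double (the variable names are just for 
illustration):

```python
import math
from decimal import Decimal

# 1e1000 is far beyond the largest finite double (about 1.8e308),
# so converting the string overflows to infinity.
big = float("1e1000")
print(math.isinf(big))

# Once the value is inf, subtracting it from itself is inf - inf,
# which IEEE-754 defines as nan rather than 0.
difference = big - big
print(math.isnan(difference))

# Decimal can represent 1e1000 exactly under the default context,
# so the same subtraction gives a true zero.
exact = Decimal("1e1000")
print(exact - exact == 0)
```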


[...]
> This amounts to an argument that in ‘newbie’ mode there should be no 
> inf or nan values in float in the first place, and anything that 
> returns one should instead raise an OverflowError or MathDomainError. 

I don't think the distinction here is between "newbies" and "experts". 
There are plenty of experts who dislike NANs, and I'm sure I wasn't the 
only newbie who fell in love with NANs when I first learned about them 
way back in the late 1980s using Apple's HyperTalk.


-- 
Steven