On 26/08/2021 19:41, Brendan Barnwell wrote:
On 2021-08-23 20:53, Steven D'Aprano wrote:
So I propose that statistics functions gain a keyword only parameter to
specify the desired behaviour when a NAN is found:
- raise an exception
- return NAN
- ignore it (filter out NANs)
which seem to be the three most common preferences. (It seems to be
split roughly equally between the three.)
Thoughts? Objections?
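To make the three options concrete, here is a minimal sketch of what such a keyword-only parameter might look like for a mean. The function name `mean_with_policy`, the parameter name `nan_policy` and its three string values are assumptions made up for illustration; nothing in the thread fixes them, and this is not the statistics module's API.

```python
import math
from statistics import fmean

# Hypothetical policy names, chosen only for this sketch.
_POLICIES = ("raise", "return", "ignore")

def mean_with_policy(data, *, nan_policy="raise"):
    """Arithmetic mean with an explicit, keyword-only NaN policy."""
    if nan_policy not in _POLICIES:
        raise ValueError(f"unknown nan_policy: {nan_policy!r}")
    values = [float(x) for x in data]
    if any(math.isnan(x) for x in values):
        if nan_policy == "raise":
            raise ValueError("NaN found in data")
        if nan_policy == "return":
            return math.nan
        # nan_policy == "ignore": filter the NaNs out before averaging
        values = [x for x in values if not math.isnan(x)]
    return fmean(values)
```

With this sketch, `mean_with_policy([1, float('nan'), 2], nan_policy="ignore")` averages only 1 and 2, while the default policy raises ValueError for the same data.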
I'd like to suggest that there isn't a single answer that is most
natural for all functions. There may be as few as two.
Guido's proposal was that mean return nan because the naive arithmetic
formula would return nan. The awkward first example was median(), which
is based on order (comparison). Now Brendan has pointed out:
One important thing we should think about is whether to add
similar handling to `max` and `min`. These are builtin functions, not
in the statistics module, but they have similarly confusing behavior
with NAN: compare `max(1, 2, float('nan'))` with `max(float('nan'), 1,
2)`.
The real behaviour of max() is to return the first argument that is not
exceeded by any that follow, so:
>>> nan, nan2 = float('nan'), float('nan')  # two distinct NaN objects
>>> max(nan, nan2, 1, 2) is nan
True
>>> max(nan2, nan, 1, 2) is nan2
True
As a definition, that is not as easy to understand as "return the
largest argument". The behaviour arises because, in Python, x > nan is
False for every x.
This choice, which is often sensible, makes the set of float values less
than totally ordered. It seems to me to be an error in principle to
apply a function whose simple definition assumes a total ordering, to a
set that cannot be ordered. So most natural to me would be to raise an
error for this class of function.
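As a rough illustration of "raise an error for this class of function", here is a sketch of a max that refuses NaN arguments. The name `strict_max` is hypothetical; it is not a proposal for the builtin's actual signature, and it only checks for float NaNs.

```python
import math

def strict_max(*args):
    """Hypothetical 'strict' variant of max() that refuses NaN arguments."""
    if not args:
        raise TypeError("strict_max expected at least one argument")
    for x in args:
        # Only float NaNs are detected here; other NaN-like types
        # (e.g. Decimal) would need their own check.
        if isinstance(x, float) and math.isnan(x):
            raise ValueError("cannot order NaN")
    return max(args)

nan = float("nan")
# The builtin depends on argument order:
#   max(1, 2, nan)  returns 2
#   max(nan, 1, 2)  returns nan
# strict_max raises ValueError in both cases.
```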
Meanwhile, functions that have a purely arithmetic definition most
naturally return nan. Are there any other classes of function than
comparison or arithmetic? Counting, perhaps, or is that comparison again?
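The difference between the two classes can be seen with naive textbook versions of each. Both functions below are simplified stand-ins written for this illustration (the median handles odd-length input only); they are not the statistics module's implementations.

```python
import math

def naive_mean(xs):
    # Purely arithmetic: any NaN propagates through the sum.
    return sum(xs) / len(xs)

def naive_median(xs):
    # Comparison-based: relies on sorted(), which assumes a total order.
    ordered = sorted(xs)
    return ordered[len(ordered) // 2]  # odd-length input only

nan = float("nan")
samples = [[nan, 1.0, 2.0], [1.0, nan, 2.0], [1.0, 2.0, nan]]
for xs in samples:
    print(xs, naive_mean(xs), naive_median(xs))
```

The mean is nan for every ordering of the data, whereas the median is whatever the sort happens to leave in the middle, so it can change with the input order.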
Proposals for a general solution, especially if based on a replacement
value, are more a question of how you would like to pre-filter your set.
An API could offer some filters, or it may be clearer left to the
caller. It is no doubt too late to alter the default behaviour of
familiar functions, but there could be a "strict" mode.
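If filtering is left to the caller, it might look like the sketch below. The helper names `drop_nans` and `replace_nans` are invented for this example and, like the check they share, handle float NaNs only.

```python
import math

def _is_nan(x):
    # Float NaNs only; an assumption made to keep the sketch short.
    return isinstance(x, float) and math.isnan(x)

def drop_nans(values):
    """Yield the values that are not NaN."""
    return (x for x in values if not _is_nan(x))

def replace_nans(values, replacement):
    """Yield the values with each NaN swapped for `replacement`."""
    return (replacement if _is_nan(x) else x for x in values)

nan = float("nan")
data = [nan, 1, 2]
largest = max(drop_nans(data))                 # NaN filtered out first
smallest = min(replace_nans(data, math.inf))   # NaN treated as +infinity
```

The choice of filter, and of any replacement value, stays visible at the call site rather than being hidden inside max() or min().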
--
Jeff Allen
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/FQNZLNISKHV74CYJMU2HPG5273VMWXUK/