[Python-ideas] Re: Fix statistics.median()?

Richard Damon Thu, 26 Dec 2019 09:26:33 -0800

On 12/26/19 10:31 AM, David Mertz wrote:

This came up in discussion here before, maybe a year ago, I think. There was a decision not to change the implementation, but that seemed like a mistake (and the discussion was about broader things).
Anyway, I propose that the obviously broken version of `statistics.median()` be replaced with a better implementation.
Python 3.8.0 (default, Nov  6 2019, 21:49:08)
>>> import numpy as np
>>> import pandas as pd
>>> import statistics
>>> nan = float('nan')
>>> items1 = [nan, 1, 2, 3, 4]
>>> items2 = [1, 2, 3, 4, nan]
>>> statistics.median(items1)
2
>>> statistics.median(items2)
3
>>> np.median(items1)
nan
>>> np.median(items2)
nan
>>> pd.Series(items1).median()
2.5
>>> pd.Series(items2).median()
2.5
The NumPy and Pandas answers are both "reasonable" under slightly different philosophies of how to handle bad values. I think raising an exception for NaNs would also be reasonable enough.
The one thing that is NOT reasonable is returning different answers for median depending on the order of the elements.

Getting garbage answers for garbage input isn't THAT unreasonable. Perhaps it could be argued that detecting common garbage input and rejecting it (perhaps with an exception) would make more sense.
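
As a rough sketch of what "rejecting it with an exception" could look like, assuming the data are plain floats or ints: the name `strict_median` is hypothetical and is not part of the statistics module, and the check deliberately covers only `float` values.

```python
import math
import statistics


def strict_median(data):
    """Hypothetical wrapper: refuse data containing a float NaN."""
    values = list(data)  # materialise once so iterators are not consumed twice
    for x in values:
        # Assumption: only float NaN is treated as garbage input here.
        # Other numeric types are passed through unchecked.
        if isinstance(x, float) and math.isnan(x):
            raise ValueError("median is undefined for data containing NaN")
    return statistics.median(values)
```

With this, `strict_median([1, 2, 3, 4])` behaves like `statistics.median()`, while either `items1` or `items2` from the session above would raise `ValueError` regardless of where the NaN sits.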

Note that the statistics module documentation implies the issue, as median implies that it requires the sequence to be orderable, and nan isn't orderable. Since the statistics module seems to be designed to handle types other than floats, detecting nan values is extra expensive, so I think it can be excused for not checking.
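
To illustrate the "isn't orderable" point and what a check would cost, here is a small sketch. The helper `has_nan` is hypothetical, and the `x != x` test assumes the element type follows the float convention that NaN compares unequal to itself; types with other conventions would need their own test.

```python
nan = float('nan')
items1 = [nan, 1, 2, 3, 4]
items2 = [1, 2, 3, 4, nan]

# Every ordering comparison involving NaN is False, so a comparison-based
# sort has no consistent information about where NaN belongs.
print(nan < 1, nan > 1, nan == nan)  # False False False

# Sorting does not raise; on CPython, as in the session above, the NaN is
# simply left where it started, which shifts the middle element.
print(sorted(items1))
print(sorted(items2))


def has_nan(values):
    """Hypothetical helper: type-agnostic NaN test via self-inequality.

    Costs one extra pass and one extra comparison per element.
    """
    return any(x != x for x in values)


print(has_nan(items1), has_nan([1, 2, 3, 4]))  # True False
```

The extra pass is cheap for a list of floats, but it is paid on every call and relies on each element type defining `!=` in a NaN-aware way, which is the overhead being weighed here.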


--
Richard Damon