David Mertz wrote:
> The behavior of your sort function is not any of the desirable options.
> Moving NaNs to the end is not the widely used Panda style of removing them
...Mertz, you are really hardheaded... I supported **all** the options of your lovely Pandas, which also supports poisoning, and you say you don't like poisoning?

> I cannot think of any situation where that behavior would be useful

Think about this: you have a population of 1 million people. You want to take the median of their heart rates. But for some reason, your calculations give you some NaNs. If you remove the NaNs, it's as if you removed people from your statistics. And since the median is the central value of the population, you're falsifying the result. Is it clear now?

> Other than wastefully creating an eager list, your filter is the same as I
> suggested for the one behavior.

...I really don't understand what you're writing.

> In general, passing a sort function, while powerful, is terrible API design
> for the non-experts who are main users of statistics module.

...non-experts do not need to pass anything, since the default is `sorted`. Non-experts probably do not even know what to pass to that parameter. Non-experts can also read the docs, you know?!?

Anyway, I'm not here to convince you of anything. Continue to use Pandas, the slowest module in the history of Python, which my company asked me to remove from all their code because it slows down their applications like an old man in a hat driving a truck down the street, and be happy.

Sayonara.

> On Thu, Dec 26, 2019, 10:42 PM Marco Sulla via Python-ideas
> python-ideas@python.org wrote:
> > Oh my... Mertz, listen to me, you don't need a
> > parameter. You only need a
> > key function to pass to sorted()
> > If you use this key function:
> > https://mail.python.org/archives/list/python-ideas@python.org/message/M3DEOZ...
> > in my median() function:
> > https://mail.python.org/archives/list/python-ideas@python.org/message/KN6BSM...
> > you can simply do:
> > median(iterable, key=iliadSort)
> >
> > and you have "poisoned" your data.
> > If you want to use the sorted iterable again later in the code, you can
> > just do:
> > sorted_it = sorted(iterable, key=iliadSort)
> > median(sorted_it, sort_fn=None)
> >
> > I prefer this approach, since this way you can avoid potentially
> > re-sorting of your data.
> > For the same reason, if you want to remove the NaNs, it's better to create
> > an iterable apart instead of creating it on the fly, because you can
> > reutilize it:
> > filtered_it = [x for x in iterable if not math.isnan(x)]
> > median(filtered_it)
> >
> > If you want to raise an error, you can use another key function:
> > class NanError(ValueError):
> >     pass
> >
> > def alertNan(x):
> >     if math.isnan(x):
> >         raise NanError("There a NaN in my dish!")
> >
> >     return x
> >
> > and then use it:
> > median(iterable, key=alertNan)
> >
> > But if you absolutely want a nan parameter, you can create a wrapper for
> > sorted:
> > def nansorted(iterable, on_nan="poison", **kwargs):
> >     if on_nan == "poison":
> >         return sorted(iterable, key=iliadSort, **kwargs)
> >
> >     if on_nan == "remove":
> >         new_iterable = [x for x in iterable if not math.isnan(x)]
> >         return sorted(new_iterable, **kwargs)
> >
> >     if on_nan == "raise":
> >         return sorted(iterable, key=alertNan, **kwargs)
> >
> >     raise ValueError(f"Unsupported on_nan parameter value: {on_nan}")
> >
> > and then use it in my median():
> > median(iterable, sort_fn=nansorted, on_nan="raise")
> >
> > or, as before
> > sorted_it = nansorted(iterable, on_nan="poison")
> > median(sorted_it, sort_fn=None)

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/5EDFJ4CTV5QLPHAYR7B2DAINYP6LDJPS/
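
For readers without the linked messages at hand, here is a minimal sketch of the difference being argued about. The names `nan_last_key` and `median_sketch` are hypothetical stand-ins, not the `iliadSort` and `median()` from the links above, whose actual code is not reproduced here. The sketch only assumes what the thread itself states: the key function moves NaNs to the end, and the median accepts a `sort_fn` that can be `None` for pre-sorted data.

```python
import math


def nan_last_key(x):
    # Hypothetical stand-in for a key function that orders NaNs
    # after every other value (assumption: not the original iliadSort).
    return (1, 0.0) if math.isnan(x) else (0, x)


def median_sketch(iterable, sort_fn=sorted, **sort_kwargs):
    # Hypothetical median with a pluggable sort function.
    # sort_fn=None means the caller promises the data is already sorted.
    if sort_fn is None:
        data = list(iterable)
    else:
        data = sort_fn(iterable, **sort_kwargs)
    n = len(data)
    if n == 0:
        raise ValueError("no data")
    mid = n // 2
    if n % 2:
        return data[mid]
    return (data[mid - 1] + data[mid]) / 2


# Five measurements, two of which came out as NaN.
rates = [60.0, float("nan"), 80.0, 70.0, float("nan")]

# Policy 1: drop the NaNs, so only three values remain.
dropped = median_sketch([x for x in rates if not math.isnan(x)])

# Policy 2: keep all five values and sort NaNs last.
kept = median_sketch(rates, key=nan_last_key)
```

With this toy data the first policy takes the middle of three values and the second takes the middle of five, so the two answers differ. Which one is the "right" median is exactly what the thread disagrees on; the sketch does not settle it.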