FWIW, although no one cares, I "withdraw" my proposed implementation. While it bugs me that I'm not sure what error I made in dealing with duplicate values in an iterable, on reflection I think the whole idea is wrong.
That is, I don't like the weirdness of the behavior of statistics.median. But what I guard against in my partitioning approach isn't every possible comparison of two items anyway. That would always take quadratic time. I just do a bunch of such comparisons according to some particular program flow, but not everything.

"Incomparability" can be a property of any pair of objects, in principle. However, I also realize the completely general question is irrelevant. NaNs really are just special in arising innocuously from relatively normal numeric operations. If I make some custom class IncomparableToEverything, it's my problem if I stick it in a list of things I want the median of.

So we could get the Pandas-style behavior simply by calling median like so:

    statistics.median((x for x in it if not math.isnan(x)))

I still feel like having median (and friends) do that internally would be worthwhile under some optional parameter. But the default value of that parameter is indeed non-obvious. In a sort of Pandas way of using arguments, we might get `on_nan=["skip"|"poison"|"raise"|"random"]`. "Random" seems like the only wrong answer, but it is the status quo.

On Thu, Dec 26, 2019 at 4:34 PM David Mertz <me...@gnosis.cx> wrote:

> FWIW, here is a timing:
>
> >>> many_nums = [randint(10, 100) for _ in range(1_000_000)]
> >>> %timeit statistics.median_low(many_nums)
> 87.2 ms ± 654 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
> >>> %timeit median(many_nums)
> 282 ms ± 3.43 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>
> I think almost all the slowdown is because `sorted()` is a C function. In
> big-O terms, mine should be an improvement since it does part of a
> Quicksort in partitioning elements, but it doesn't actually bother sorting
> the smaller partition. It *does* make one pass through to find the min
> or max though.
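To make the `on_nan` idea a bit more concrete, here is a rough sketch of a wrapper, not a proposed patch. The function name `median_on_nan` is made up for illustration, `on_nan` is not an actual parameter of statistics.median, and I picked "raise" as the default only because some default is needed. It assumes every item is a real number that math.isnan() accepts. I left out "random" on purpose, since that is just what you get today by calling statistics.median directly.

```python
import math
import statistics

# Hypothetical helper: neither this function nor `on_nan` exists in the
# statistics module. It only illustrates the three sensible policies.
_POLICIES = ("skip", "poison", "raise")


def median_on_nan(data, on_nan="raise"):
    """Median of `data` with an explicit NaN policy.

    "skip"   -> drop NaNs first (the Pandas-style behavior)
    "poison" -> return NaN if any NaN is present
    "raise"  -> raise ValueError if any NaN is present
    """
    if on_nan not in _POLICIES:
        raise ValueError(f"unknown on_nan policy: {on_nan!r}")

    values = list(data)  # materialize, since `data` may be a generator
    clean = [x for x in values if not math.isnan(x)]

    if len(clean) != len(values):  # at least one NaN was present
        if on_nan == "raise":
            raise ValueError("NaN encountered in data")
        if on_nan == "poison":
            return math.nan
        # on_nan == "skip": fall through and use the filtered values

    # An empty `clean` still raises statistics.StatisticsError, as usual.
    return statistics.median(clean)
```

So `median_on_nan(it, on_nan="skip")` does the same thing as the generator expression above, and the other two policies cost one extra pass over the data.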
--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.