On Mon, Apr 08, 2019 at 02:55:54AM -0700, Nathaniel Smith wrote: > Right, by "doesn't affect" I meant "cannot get any benefit, even if their > code is modified".
Ah, sorry I misunderstood you. > > # This only makes sense if data is a sequence (list) > > # not an iterator. > > quartiles = quantiles(data, n=4) > > quintiles = quantiles(data, n=5) > > deciles = quantiles(data, n=10) > > percentiles = quantiles(data, n=100) > > > > If only we had some kind of API that could compute multiple quantiles at > the same time... You mean something like this? quartiles, quintiles, deciles, percentiles = quantiles(data, n=(4, 5, 10, 100)) Yuck :-) I'd rather allow an already_sorted=True parameter :-) > > I'm perfectly comfortable with allowing the caller to lie if they want. > > Its their own foot they're shooting. > > > > An already_sorted=True argument would be an explicit opt in, and consenting > adults would apply. But your message was very explicit that __issorted__ > can be set implicitly, though. For example, this would give garbage results: > > # implicitly sets the sorted flag > data.sort() > # preserves the flag, because hey it's sorted by *some* key > data.reverse() > statistics.median(data) It would certainly be putting more responsibility on the caller to ensure the sorted flag was correct. > You can't use this in statistics.median because it would break > compatibility. I would argue differently. > Also, isn't the whole point of 'statistics' to be the > simple, reliable module for folks who aren't that worried about speed? This > seems like a massive footgun. Perhaps. Perhaps not as massive as it seems. The expected use-case would be something like this: data = get_data() a = median(data) data.sort() b = median(data) # Like a, but faster. To be a problem, the caller would need to do something like: data = get_data() a = median(data) data.sort(reversed=True) # or some weird key function b = median(data) # Garbage result. I can't see people shooting themselves in the foot in this way by accident very often. But fair enough, it is a risk. > > (I wouldn't be so blasé about this if it were a function written in C > > that could segfault if the list wasn't sorted.) > > > > Silently giving the wrong answer is way worse than a segfault. It depends on what you're worried about, and who gets the blame for the wrong answer. As I understand it, it is the position of the core-devs that *any* seg fault in the interpreter or stdlib is a serious bug that must be fixed (possibly excepting ctypes); but if Python code returns garbage when you give it garbage input, that may or may not be considered a bug. In this case, passing a list with the flag set when it is not actually sorted correctly would be a "Garbage In, Garbage Out" error, just as if they had explicitly passed a already_sorted=True argument. But I take your earlier point that the argument version is explicit and opt-in, while the flag is implicit and may not be opt-in. Given all the points already raised, I think that an explicit SortedList might be more appropriate. Thanks everyone for all the feedback. -- Steven _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/