Even though I was the first person in this thread to suggest collections.OrderedSet, I'm "meh" about it now. As I read more and played with the sortedcollections package, it seemed to me that while I might want a set that iterated in a determinate and meaningful order moderately often, insertion order would make up a small share of those use cases.
On Mon, Dec 23, 2019, 8:05 PM Kyle Stanley <aeros...@gmail.com> wrote: > Larry Hastings wrote: > > a hypothetical collections.OrderedSet would probably work just fine. > > I'm also in agreement with adding a collections.OrderedSet. There > certainly seems to be a practical use case for a Set that preserves > insertion order, but with significant push back to modifying the existing > Set implementation (primarily due to performance concerns). > > If later down the road an improvement to OrderedSet is made that gives it > equal or better average performance across the board compared to Set, we > can consider changing the default implementation *if* there's significant > demand for it. > > Larry Hastings wrote: > > And he'd probably use it too, as that makes the code clearer / easier to > read. If you read code using an OrderedSet, you know what the intent was > and what the semantics are. If you see code using a dict but setting the > values to None, you think "that's crazy" and now you have a puzzle to > solve. Let's hope those comments are accurate! > > This is an excellent point. My first thought if someone were to be using a > Dict with all values set to None would be: why not use a Set here? As Larry > said, the need for preservation of insertion order would have to be > explicitly described in the code comments. Realistically speaking, code > comments are not something that can be consistently relied upon, especially > if it's part of some boilerplate format that gets replicated or if the > container is used in multiple locations. I'd much rather have a dedicated > data structure that clearly describes what it does based on the name alone, > IMO that's a million times better for readability purposes. > > Also, this is mostly speculation since I haven't ran any benchmarks for an > OrderedSet implementation, but I strongly suspect that OrderedSet will end > up having better average performance for add, pop, and update than trying > to use a dictionary with None values (assuming it's implemented in C or at > least with a C accelerator). > > Not to mention allowing the ability to update (for both addition and > removal) based on intersections and unions across ordered sets, which > currently isn't possible to do with dictionaries (at least not directly > with |=, &=, -=. and ^=). I'm not sure how much of a practical use case > this would have for ordered sets specifically, but it's a nice bonus. > > On Sat, Dec 21, 2019 at 7:59 AM Larry Hastings <la...@hastings.org> wrote: > >> >> (mixing together messages in the thread, sorry threaded-view readers) >> >> On 12/19/19 3:15 PM, Tim Peters wrote: >> >> [Nick] >> >> I took Larry's request a slightly different way: >> >> [...] >> >> Only Larry can answer whether that would meet his perceived need. My >> _guess_ is that he wouldn't know OrderedSet existed, and, even if he >> did, he'd use a dict with None values anyway (because it's less hassle >> and does everything he wanted). >> >> >> At last, something I can speak knowledgeably about: Larry's use case! >> Regarding Larry, I'd say >> >> 1. his use case was small enough that almost anything maintaining >> order would have worked, even a list, >> 2. he often *does have* a pretty good idea what goodies are salted >> away in the Python standard library, and >> 3. a hypothetical collections.OrderedSet would probably work just >> fine. And he'd probably use it too, as that makes the code clearer / >> easier to read. If you read code using an OrderedSet, you know what the >> intent was and what the semantics are. If you see code using a dict but >> setting the values to None, you think "that's crazy" and now you have a >> puzzle to solve. Let's hope those comments are accurate! >> >> >> Also, speaking personally, at least once (maybe twice?) in this thread >> folks have asked "would YOU, Mr Larry, really want ordered sets if it meant >> sets were slightly slower?" >> >> The first thing I'd say is, I'm not sure why people should care about >> what's best for me. That's sweet of you! But you really shouldn't. >> >> The second thing is, my needs are modest, so the speed hit we're talking >> about would likely be statistically insignificant, for me. >> >> And the third thing is, I don't really put the set() API through much of >> a workout anyway. I build sets, I add and remove items, I iterate over >> them, I do the occasional union, and only very rarely do I use anything >> else. Even then, when I write code where I reach for a fancy operation >> like intersection or symmetric_difference--what tends to happen is, I have >> a rethink and a refactor, and after I simplify my code I don't need the >> fancy operations anymore. I can't remember the last time I used one of >> these fancy operations where the code really stuck around for a long time. >> >> So, personally? Sure, I'd take that tradeoff. As already established, I >> like that dict objects maintain their order, and I think it'd be swell if >> sets maintained their order too. I suspect the performance hit wouldn't >> affect *me* in any meaningful way, and I could benefit from the >> order-maintaining semantics. I bet it'd have other benefits too, like >> making regression tests more stable. And my (admittedly uninformed!) >> assumption is, the loss of performance would mostly be in these >> sophisticated operations that I never seem to use anyway. >> >> But I don't mistake my needs for the needs of the Python community at >> large. I'd be mighty interested to read other folks coming forward and >> saying, for example, "please don't make set objects any slower!" and >> talking us through neat real-world use cases. Bonus points if you include >> mind-boggling numbers and benchmarks! >> >> >> On 12/20/19 5:09 AM, Peter Wang wrote: >> >> As Larry pointed out earlier, ordered dicts posed a problem for >> MicroPython. >> >> Just a minor correction: that wasn't me, that was Petr Viktorin. >> >> >> Ten days left until we retire 2.7, >> >> >> */arry* >> _______________________________________________ >> Python-Dev mailing list -- python-dev@python.org >> To unsubscribe send an email to python-dev-le...@python.org >> https://mail.python.org/mailman3/lists/python-dev.python.org/ >> Message archived at >> https://mail.python.org/archives/list/python-dev@python.org/message/JYSJ55CWUE2FFK2K2Y75EWE7R7M6ZDGG/ >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ > Python-Dev mailing list -- python-dev@python.org > To unsubscribe send an email to python-dev-le...@python.org > https://mail.python.org/mailman3/lists/python-dev.python.org/ > Message archived at > https://mail.python.org/archives/list/python-dev@python.org/message/2KEAXZBQP6YJ2EV4YURQZ7NFFRJZQRFH/ > Code of Conduct: http://python.org/psf/codeofconduct/ >
_______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/3U5IUE7RFXHMSP6BQ4ABQ3YBP2BAEVPA/ Code of Conduct: http://python.org/psf/codeofconduct/