Well, the other element for me is that I would never want an order GUARANTEE to preclude a later every faster version.
I know there are different test scenarios, but I think the first step for it to be remotely reasonable is an ordered implementation that is faster in any reasonable test cases. For my own uses, I typically use sets with hundreds, maybe thousands of elements. Often strings. With tens of elements I care about speed less. I myself rarely work with sets of millions of elements. On Mon, Aug 24, 2020, 7:25 PM Cade Brown <brown.c...@gmail.com> wrote: > When I quoted 10% as a rough measurement I was referring more to set > operations like union, etc, which choosing the larger set may have a very > real performance advantage. > > > For membership tests which are false, it may be quicker, as less memory is > accessed. The buckets will be smaller (1,2,4, or 8 bytes vs 8 or 16 (32bit > vs 64bit). But, to compare the entry's hash, an indirection into the > entries array is required... The hueristics become less clear when > considering how many collisions there are. It may also change for different > tuning parameters (i.e. a lower load factor would alleviate the pointer > dereferences require to compare items, but at the cost of more memory). > > It may very well be the case that the target load factor only decreases a > small amount (so that the ordered implementation still uses less memory), > but the performance benefits of quick rejections make overall performance > better. I don't know how 'objective' we can be here without discussing > specific data sets people may be using. > > On Mon, Aug 24, 2020, 7:15 PM David Mertz <me...@gnosis.cx> wrote: > >> The main purpose of sets is FAST membership testing. Everything I've seen >> in prior discussions convinced me that preserving insertion order would >> slow that down. >> >> If membership testing lost even 10% speed, I would be -1000 on the idea. >> If someone comes up with actual working code that is no more than 1% >> slower, is move to -0. >> >> If speed miraculously got faster in such a rewrite (analogous with >> dictionaries whose order was initially a happy side-effect, but I've seen >> explanations of why that wouldn't happen) I'd be +1. >> >> But in the last case, the entirely of my support would be for the speed >> up, and exactly none of it would be for the ordering. >> >
_______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/K7LSURQ7SCJQTN2NYGK3U5OTIZBXQDCO/ Code of Conduct: http://python.org/psf/codeofconduct/