[Python-ideas] Re: Reliable Ordering Of Sets

David Mertz Mon, 24 Aug 2020 16:35:45 -0700

Well, the other element for me is that I would never want an order
GUARANTEE to preclude a later every faster version.


I know there are different test scenarios, but I think the first step for
it to be remotely reasonable is an ordered implementation that is faster in
any reasonable test cases.

For my own uses, I typically use sets with hundreds, maybe thousands of
elements. Often strings. With tens of elements I care about speed less. I
myself rarely work with sets of millions of elements.

On Mon, Aug 24, 2020, 7:25 PM Cade Brown <[email protected]> wrote:

> When I quoted 10% as a rough measurement I was referring more to set
> operations like union, etc, which choosing the larger set may have a very
> real performance advantage.
>
>
> For membership tests which are false, it may be quicker, as less memory is
> accessed. The buckets will be smaller (1,2,4, or 8 bytes vs 8 or 16 (32bit
> vs 64bit). But, to compare the entry's hash, an indirection into the
> entries array is required... The hueristics become less clear when
> considering how many collisions there are. It may also change for different
> tuning parameters (i.e. a lower load factor would alleviate the pointer
> dereferences require to compare items, but at the cost of more memory).
>
> It may very well be the case that the target load factor only decreases a
> small amount (so that the ordered implementation still uses less memory),
> but the performance benefits of quick rejections make overall performance
> better. I don't know how 'objective' we can be here without discussing
> specific data sets people may be using.
>
> On Mon, Aug 24, 2020, 7:15 PM David Mertz <[email protected]> wrote:
>
>> The main purpose of sets is FAST membership testing. Everything I've seen
>> in prior discussions convinced me that preserving insertion order would
>> slow that down.
>>
>> If membership testing lost even 10% speed, I would be -1000 on the idea.
>> If someone comes up with actual working code that is no more than 1%
>> slower, is move to -0.
>>
>> If speed miraculously got faster in such a rewrite (analogous with
>> dictionaries whose order was initially a happy side-effect, but I've seen
>> explanations of why that wouldn't happen) I'd be +1.
>>
>> But in the last case, the entirely of my support would be for the speed
>> up, and exactly none of it would be for the ordering.
>>
>

_______________________________________________
Python-ideas mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/K7LSURQ7SCJQTN2NYGK3U5OTIZBXQDCO/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Re: Reliable Ordering Of Sets

Reply via email to