Tim Peters wrote:
> RE [Richard Higginbotham higgi...@gmail.com]
......
> Traversing via a set or dict instead reads the objects' data in
> pseudo-random memory order instead, which can give a very high rate of
> cache misses.  In addition, the internal hash table probes also occur
> in pseudo-random order, adding yet another layer of cache misses.
> Plain old lists are in fact pleasant in many ways :-)

Very informative, thank you. I tried different tables sizes and saw that you 
are correct that the table size doesn't matter. I assumed it was collisions 
that were adding the extra time when I increased the size of A and B. You are 
probably right that cache misses are a/the factor. In theory for algebraic set 
operations the size of the left hand argument (that's having its values 
"compared" to the other set) is the limiter. When I saw that time increase 
beyond the size increase of A, I thought it must be the hash functions but I 
didn't really want to say that not knowing the internals. Actually now I wish 
it was as it would easier to account for in judging my algorithm :-).
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ANGHWMIWUJZWOWC37EPL3O4ICA5UZPK5/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to