Sorry! A previous attempt to reply got sent before I typed anything :-( Very briefly:
> >>> timeit.timeit("set(i for i in range(1000))", number=100_000) [and other examples using a range of integers] The collision resolution strategy for sets evolved to be fancier than for dicts, to reduce cache misses. This is important for sets because the _only_ interesting thing about an element wrt a set is whether or not the set contains it. Lookup speed is everything for sets. If you use a contiguous range of "reasonable" integers for keys, the integer hash function is perfect: there's never a collision. So any such test misses all the work Raymond did to speed set lookups. String keys have sufficiently "random" hashes to reliably create collisions, though. Cache misses can, of course, have massive effects on timing. > Add (much faster for dicts): > >>> timeit.timeit("s = set(); s.add(0)", number=100_000_000) > 13.330938750001224 > >>> timeit.timeit("d = {}; d[0] = None", number=100_000_000) > 5.788865385999088 In the former case you're primarily measuring the time to look up the "add" method of sets (that's more expensive than adding 0 to the set). A better comparison would, e.g., move `add = s.add` to a setup line, and use plain "add(0)" in the loop. That's it! _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/5UKYDB36WDRRIYOLRD52QS4I43SCAAJ6/ Code of Conduct: http://python.org/psf/codeofconduct/