Maybe a stupid question: What are use cases for sorted dicts?
I don’t think I’ve ever needed one. Also, I can’t quite tell from the discussion If a “sorted dict” implements something new, or is an internal data structure that gives better performance for particular use cases. I.e. is a sorted dict a Mapping? -CHB On Tue, Nov 9, 2021 at 9:45 PM Dan Stromberg <drsali...@gmail.com> wrote: > > On Tue, Nov 9, 2021 at 9:00 PM Steven D'Aprano <st...@pearwood.info> > wrote: > >> Sorting dicts has been discussed on the Python-Ideas mailing list, it is >> too hard and expensive to justify for the limited use-cases for it. If >> you want to sort a dict, you are best to sort the dict's keys, then >> create a new dict. Or possibly use a dedicated Sorted Mapping type like >> a red-black tree or similar. >> > > There are several implementations of key-order sorted dicts available in > Python, and more than a few of them are well tested. I am the maintainer > of one of them: https://pypi.org/project/treap/ which I ported from > Java. I also did a performance comparison among several of them a few > years back. > > Red-black trees are popular, perhaps especially among Java developers, but > I've never understood why. They aren't the fastest. Among traditional > tree-based key-sorted dicts, treaps are frequently fastest. > > However, SortedDict is not uncommonly faster than treaps - I believe this > is because SortedDict is very good at maximizing locality of reference. > Traditional trees are almost always going to do an entire cache line hit > for every node in large trees, even if those nodes are "next to each other" > when sorted/traversed. SortedDicts put keys that are nearly equal, in > nearly the same part of memory - so multiple values can be retrieved with a > single cache line hit. > > Sorting a dict's keys inside a loop tends to give O(n^2) algorithms, and > sometimes even O(n^2 * logn). This is not good. A traditional tree should > give O(nlogn) algorithms under similar circumstances, and although my gut > is telling me SortedDict is similar the presentation linked below suggests > a different (though still better than sorting in a loop) runtime for > SortedDict. > > It's true that key-sorted dicts are not all that common. Their most > common use is probably for implementing finite caches - evictions can > benefit from ordered keys. However, we already have functools.lru_cache. > > Here's my most recent performance comparison: > https://stromberg.dnsalias.org/~strombrg/sorted-dictionary-comparison/Datastructure%20comparison.pdf > > Here's a PyCon 2016 presentation about SortedContainers, that includes > SortedDict: > https://www.youtube.com/watch?v=7z2Ki44Vs4E > > I think it would make sense to include at least one key-sorted dict in > CPython, and feel that SortedDict would not be a bad way to go. Neither > would treap. > > > _______________________________________________ > Python-Dev mailing list -- python-dev@python.org > To unsubscribe send an email to python-dev-le...@python.org > https://mail.python.org/mailman3/lists/python-dev.python.org/ > Message archived at > https://mail.python.org/archives/list/python-dev@python.org/message/VNK4VPGA4LJH7RMR4LCCACEG2WNDBBFO/ > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
_______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/U7FQIE5NDWY3AQI5QDMIJSIKVIQ3YUPC/ Code of Conduct: http://python.org/psf/codeofconduct/