Maybe a stupid question:

What are use cases for sorted dicts?

I don’t think I’ve ever needed one.

Also, I can’t quite tell from the discussion If a “sorted dict” implements
something new, or is an internal data structure that gives better
performance for particular use cases. I.e. is a sorted dict a Mapping?

-CHB



On Tue, Nov 9, 2021 at 9:45 PM Dan Stromberg <drsali...@gmail.com> wrote:

>
> On Tue, Nov 9, 2021 at 9:00 PM Steven D'Aprano <st...@pearwood.info>
> wrote:
>
>> Sorting dicts has been discussed on the Python-Ideas mailing list, it is
>> too hard and expensive to justify for the limited use-cases for it. If
>> you want to sort a dict, you are best to sort the dict's keys, then
>> create a new dict. Or possibly use a dedicated Sorted Mapping type like
>> a red-black tree or similar.
>>
>
> There are several implementations of key-order sorted dicts available in
> Python, and more than a few of them are well tested.  I am the maintainer
> of one of them: https://pypi.org/project/treap/ which I ported from
> Java.  I also did a performance comparison among several of them a few
> years back.
>
> Red-black trees are popular, perhaps especially among Java developers, but
> I've never understood why.  They aren't the fastest.  Among traditional
> tree-based key-sorted dicts, treaps are frequently fastest.
>
> However, SortedDict is not uncommonly faster than treaps - I believe this
> is because SortedDict is very good at maximizing locality of reference.
> Traditional trees are almost always going to do an entire cache line hit
> for every node in large trees, even if those nodes are "next to each other"
> when sorted/traversed.  SortedDicts put keys that are nearly equal, in
> nearly the same part of memory - so multiple values can be retrieved with a
> single cache line hit.
>
> Sorting a dict's keys inside a loop tends to give O(n^2) algorithms, and
> sometimes even O(n^2 * logn).  This is not good.  A traditional tree should
> give O(nlogn) algorithms under similar circumstances, and although my gut
> is telling me SortedDict is similar the presentation linked below suggests
> a different (though still better than sorting in a loop) runtime for
> SortedDict.
>
> It's true that key-sorted dicts are not all that common.  Their most
> common use is probably for implementing finite caches - evictions can
> benefit from ordered keys.  However, we already have functools.lru_cache.
>
> Here's my most recent performance comparison:
> https://stromberg.dnsalias.org/~strombrg/sorted-dictionary-comparison/Datastructure%20comparison.pdf
>
> Here's a PyCon 2016 presentation about SortedContainers, that includes
> SortedDict:
> https://www.youtube.com/watch?v=7z2Ki44Vs4E
>
> I think it would make sense to include at least one key-sorted dict in
> CPython, and feel that SortedDict would not be a bad way to go.  Neither
> would treap.
>
>
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/VNK4VPGA4LJH7RMR4LCCACEG2WNDBBFO/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-- 
Christopher Barker, PhD (Chris)

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/U7FQIE5NDWY3AQI5QDMIJSIKVIQ3YUPC/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to