I started writing up a SortedDict use case I have, but it's very elaborate and I expect it would just end with endless pointless argument about other approaches I _could_ take. But I already know all those ;-)
So let's look at something conceptually "dead easy" instead: priority queues. They're a basic building block in algorithms from many fields. In the distributed Python, `heapq` is the only thing that comes close to being reasonably efficient and scalable for that. However: - It only supports min-heaps. It's not that max-heaps aren't also useful (indeed, _under the covers_ CPython also implements max-heaps for its own internal use). It's more that the original API simply doesn't generalize cleanly. - So there's a body of word-of-mouth "tricks" for making a min-heap "act like" a max-heap. For integer or float priorities, it suffices to use the negation of "the real" priority. For other kinds of values, it gets trickier. In the end, you can wrap each item in a class that implements `x.__lt__(y)` by doing `y.value < x.value`. But few people appear to be aware of that, while nobody wants to endure the bother ;-) - It's often the case that "the priority" needs to be computed from an item's value. `heapq` has no direct support for that. Instead there's another body of word-of-mouth tricks akin to the old decorate-sort-undecorate approach sorting used. Rather than store the values in the heap, you store `(priority, value)` tuples. But then various unpleasant things can happen if two priorities are tied: in that case tuple comparison falls back to comparing the values. But, e.g., value comparison may be very expensive, or the values may not support __lt__ at all. So instead you store `(priority, unique_integer, value)` triples. And hope that you don't really need a max-heap, lest it get even more obscure. Using SortedContainers, none of those are issues. It's all straightforward and obvious. If you have a list L always sorted by priority, extracting from a "min queue" just requires L.pop(0), or from a "max queue" L.pop(-1) (or L.pop() - -1 is the default, same as for the Python list.pop()). No tricks are needed. Fine too if sometimes you want to extract the min, while at other times the max. Or, e.g., peek at the 5 highest-priority values and pick the one that best fits resources available at the time. In cases of computed priorities, the SortedKeyList constructor allows specifying a function to be used to compute the keys used to determine the list's ordering. Again straightforward and obvious. from sortedcontainers import SortedKeyList L = SortedKeyList([(i, j) for i in range(1, 5) for j in range(1, 5)], key=lambda t: t[0]/t[1]) >>> [t for t in L] [(1, 4), (1, 3), (1, 2), (2, 4), (2, 3), (3, 4), (1, 1), (2, 2), (3, 3), (4, 4), (4, 3), (3, 2), (2, 1), (4, 2), (3, 1), (4, 1)] That said, in most of my code I end up _removing_ uses of SortedContainers, in favor of faster ways of getting the job done. The package isn't the last resort for me, it's the first. I can write code in straightforward ways that scale well from the start. If I need more speed later, fine, I can puzzle out faster ways to get the task done. If you're proud of that you _start_ by plotting ways to minimize the number of sorts you do, you're coding in C++, not Python ;-) So I suggest people are staring at the wrong end when asking for use cases that can't be done without the package. Those are necessarily non-trivial. Having a sorted collection is "more than good enough" for many tasks. As above, e.g., there's no reason to imagine that Grant Jenks had priority queues in mind at all, yet the flavors of sorted lists he implemented are very easy to use for that task. _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/XXAXVV3KJ4ORQ7K6ELSS4R7S5725ASRE/ Code of Conduct: http://python.org/psf/codeofconduct/