[Steven D'Aprano <st...@pearwood.info>] > Sorting doesn't require a total order. Sorting only requires a weak > order where the only operator required is the "comes before" operator, > or less than. That's precisely how sorting in Python is implemented.
Let's be careful here. Python's list.sort() guarantees that _if_ the elements are totally orderable, it will deliver the unique stable permutation consistent with that ordering. Otherwise, the only thing it guarantees is that the output will be _some_ permutation of the input. There was no attempt to do something "useful" in the absence of a total order - whatever happens then is purely down to implementation accidents. The reason is sticks to __lt__ is pragmatic, not theoretical: at the time it was designed, rich comparisons weren't yet implemented, and all comparisons were implemented via __cmp__ methods. But rich comparisons were scheduled for a future release, and sticking to __lt__ was my deliberate nod to making it as easy as possible then for classes that wanted to define their own order to participate in sorting. Implement __lt__, and you're good to go. I don't think it's documented, but the guts of the bisect and heapq modules were reworked (I think by Raymond) to stick to __lt__ too, for the same reason. But, e.g., >>> sorted([{1, 2}, {3}, {1}]) [{1, 2}, {3}, {1}] despite that {1} < {1, 2} As far as sorted() is concerned, that's "garbage in garbage out". What happens today (and since the current list.sort() was written - but may not be so tomorrow): First it asks whether {3} < {1, 2}. It gets back False, so concludes the first two elements are already in order. Then it asks whether {1} < {3}. Again False, so it concludes those two are also in order. And that's it. It's done. It never compares {1, 2} to {1} at all. By comparing only adjacent pairs of elements, it concludes that the entire input consists of one non-descending run. For totally ordered input elements, that means "it's already sorted". > Here is an interesting discussion of a practical use-case of sorting > data with a partial order: > > https://blog.thecybershadow.net/2018/11/18/d-compilation-is-too-slow-and-i-am-forking-the-compiler/ list.sort() is of no use for finding a linear ordering that respects a collection of."A must precede B" constraints But Python 3.9 introduces a capable new functools.TopologicalSorter class that is good for that. Although, unlike the approach in that blog, it doesn't cater to cycles - if there's a cycle in the input, it raises CycleError. >> but we really need to use a >> type that has these exceptional values. Imagine that sort/median was >> defined to type check its parameter, > No need to imagine it, sort already type-checks its arguments: > > py> sorted([1, 3, 5, "Hello", 2]) > TypeError: '<' not supported between instances of 'str' and 'int' Yes and no ;-) list.sort() makes no attempt to check element types for the sake of checking them. In the example, it's just passing on an exception raised by asking whether "Hello" < 5 (as in the example above, it first compares adjacent pairs to identify the longest initial run). If, e.g., we sorted a list like [a, b, c], where (b < a) == (c < b) == False it would stop then, even if comparing `a` with `c` would raise an exception. > [snipped stuff about NaNs] _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/IY6C4E3M5472IDPDG24KYNLK2AZNW3DI/ Code of Conduct: http://python.org/psf/codeofconduct/