It surely would be helpful. I still find it a bit too single-case oriented.

Maybe having an equivalent __sorting_algo__ property with a value of the current sorting algorithm would be more general? There could be a SortingAlgo base class, which could be extended into classes like:  - SortingAlgoNone(SortingAlgo) or SortingAlgoUnsorted(SortingAlgo) which would be the default for non-sorted lists (or just the value None)
 - SortingAlgoAscending(SortingAlgo)
 - SortingAlgoAscendingNumbers(SortingAlgoAscending)
 - MyCustomSortingAlgo(SortingAlgo)
 - ...
It would allow to mark a list as sorted with any algorithm, and of course, any code that would use these lists would be able to read/write this __sorting_algo__.

And complementary idea, we could have an extra arg to sort() (and other functions like this one) like `trust_declared_algo=True`, that if set would only sort the list if its list.__sorting_algo__ is compatible (a subclass of the sorting algo it uses, or the class itself).

The rest of the behaviours (when the __sorting_algo__ would be set or reset) would be as described by Steven in the original proposal.

-Brice


Le 8/4/19 à 4:32, Steven D'Aprano a écrit :
There are quite a few important algorithms which require lists to be
sorted. For example, the bisect module, and for statistics median and
other quantiles.

Sorting a list is potentially expensive: while Timsort is very
efficient, it is still ultimately an O(N log N) algorithm which limits
how efficient it can be. Checking whether a list is sorted is O(N).

What if we could check that lists were sorted in constant time?

Proposal: let's give lists a dunder flag, __issorted__, that tracks
whether the list is definitely sorted or not:

- Empty lists, or lists with a single item, are created with
   __issorted__ = True; lists with two or more items are created
   with the flag set to False.

- Appending or inserting items sets the flag to False.

- Deleting or popping items doesn't change the flag.

- Reversing the list doesn't change the flag.

- Sorting it sets the flag to True. (The sort method should NOT
   assume the list is sorted just because the flag is set.)

Functions that require the list to be sorted can use the flag as a
quick check:

     if not alist.__issorted__:
         alist.sort()
     ...

The flag will be writable, so that functions such as those in
bisect can mark that they have kept the sorted invariant:


     bisect.insort(alist, x)
     assert alist.__issorted__


Being writable, the flag is advisory, not a guarantee, and "consenting
adults" applies. People can misuse the flag:

     alist = [1, 4, 2, 0, 5]
     alist.__issorted__ = True

but then they have nobody to blame but themselves if they shoot
themselves in the foot. That's no worse than the situation we have now,
were you might pass an unsorted list to bisect.

The flag doesn't guarantee that the list is sorted the way you want
(e.g. biggest to smallest, by some key, etc) only that it has been
sorted. Its up to the user to ensure they sort it the right way:

     # Don't do this and expect it to work!
     alist.sort(key=random.random)
     bisect.insort(alist, 1)


If you really want to be sure about the state of the list, you have to
make a copy and sort it. But that's no different from the situation
right now. But for those willing to assume "consenting adults", you
might trust the flag and avoid sorting.

Downsides:

- Every list grows an extra attribute; however, given that lists are
   already quite big data structures and are often over-allocated, I
   don't think this will matter much.

- insert(), append(), extend(), __setitem__() will be a tiny bit
   slower due to the need to set the flag.



Thoughts?




_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to