[Tim Delaney <timothy.c.dela...@gmail.com>]
> Isn't there already always a scan of the iterable to build the keys array
> for sorting (even if no key keyword param is specified)?

No - `.sort()` is a list method, and as such has nothing to do with
arbitrary iterables, just lists (perhaps you're thinking of the
`sorted()` function?).  If no `key=` argument is passed, the list guts
itself is used (as-is) as the vector of keys.


> In which case adding the homogenity check there seems like it shouldn't
> add much overhead at all (I have to say that I was surprised with 10+%
> reductions in speed in some of the heterogenous TimSort tests for this 
> reason).

Those are worst cases where the current sort does very few compares
(like just N-1 for a list of length N).  Because they do "amazingly"
few compares, they're already "amazingly" fast.  And they also do
little data movement (e.g., none at all for /sort & =sort, and N//2
pointer swaps for \sort).  Because of that any new O(N) overhead would
make them significantly slower - unless the new overhead pays off by
allowing a larger time saving than it costs.(as it does when the list
is same-type).

There is a "natural" place to insert "same type?" checks:  the outer
loop of the sort marches over the vector once, left to right,
alternately identifying the next natural run, then possibly extending
it and/or merging it into previous runs.  The checks could be put
there instead, but the code would be ugly and more invasive - I
wouldn't even bother trying it.


> And could specific richcompares be refactored so there was a "we really know
> what the types are is, no need to check" version available to sort() (with
> the typechecking version available for general use/unoptimised sorting)?

They're not _just_ type-aware.  For example, the special function for
ints is specialized to integers that happen to fit in one (internal)
"digit", and the special function for strings is specialized to those
that happen to be stored in PyUnicode_1BYTE_KIND format.  Adding such
stuff to the public C API would be ... well, for a start, tedious ;-)
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to