Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

Elliot Gorokhovsky Sun, 05 Mar 2017 19:25:33 -0800

On Sun, Mar 5, 2017 at 8:20 PM David Mertz <me...@gnosis.cx> wrote:

>
> B. I think a very large percentage of lists are heterogeneous.  But most
> of the time when they are, it's not because there are several different
> numeric types but rather because a list collects some sort of custom
> objects.  Maybe those even all share some ancestor, but the point is they
> are user/library defined classes that won't have a fast comparison like
> ints or floats do.
>


As long as the ob_type for all the objects is the same, my patch will sort
significantly faster, as it replaces PyObject_RichCompareBool with
ob_type->tp_richcompare. It doesn't matter if your list is builtin-types or
not, as long as the types are all the same, you get a speedup (though it's
greater for built-in types).

Also, I actually wrote a crawler to see how many PyPI packages implement a
custom compare function (like you would have to for user-defined types).
The answer is less than 6%. Code:
https://github.com/embg/python-fast-listsort/tree/master/crawler


> B (part 2): I haven't actually read his patch, but I'm assuming that
> Elliot's approach can start scanning the list, and as soon as it find a
> complex/custom object at index 0 ignore the rest of the scan.  So for that
> case, there is very little harm in his linear scan (over one item).
>

Yup, the pre-sort check breaks out of the loop as soon as it sees
heterogeneity. So unless your float is at the end of the whole list of ints
(as in my worst-case benchmark), the cost is basically 0.

_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

Reply via email to