> On 8 Mar 2017, at 22:58, Elliot Gorokhovsky <elliot.gorokhov...@gmail.com>
> wrote:
>
> On Wed, Mar 8, 2017 at 2:14 PM Barry <ba...@barrys-emacs.org
> <mailto:ba...@barrys-emacs.org>> wrote:
> Can you assume that the list is of type(list[0]) and use that type's optimised
> sort?
> But in the optimised sort code check that the types are as required.
> If you hit an element that is not of the required type then fall back to the
> unoptimised sort.
>
> Well, how would you tell if you've hit an element that is not of the required
> type? You'd have to check during every compare, right? And that's precisely
> what we're trying to avoid!
As I understood it, the trick behind the optimisation is to compare the type
pointer of an object to see if it's the same as the type supported by the
chosen optimised sort.
It was not clear to me that you need to scan the list at the start to make sure
it's homogeneous. Given that the type check is so cheap, would it really
slow the sort to do the pointer check in the compare code? I am not
suggesting you run the full rich compare on each comparison.
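A minimal sketch of what I mean (hypothetical names, not the actual CPython C
code): the compare itself does the cheap type-identity check and falls back to
the generic path only when it fails.

```python
# Hypothetical sketch of a per-compare type check with fallback; not the
# actual patch, just the shape of the idea in Python terms.
import operator

def make_checked_lt(expected_type, fast_lt, generic_lt):
    """Return a less-than function that verifies both operands are
    exactly `expected_type` before taking the optimised path."""
    def checked_lt(a, b):
        # The check is just two pointer (type identity) comparisons.
        if type(a) is expected_type and type(b) is expected_type:
            return fast_lt(a, b)   # optimised, type-specific compare
        return generic_lt(a, b)    # fall back to the general protocol
    return checked_lt

int_lt = make_checked_lt(int, operator.lt, operator.lt)
```

Here a non-int operand silently takes the generic branch instead of aborting
the sort, which is the fallback behaviour being discussed.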
> The whole point of my patch is that we do O(nlogn) compares, but only have
> O(n) elements, so it's much cheaper to do all the type checks in advance, and
> in the very likely case that our list is homogeneous, switch to an optimized
> special-case compare function.
So with my suggestion you do O(n log n) * 2 pointer compares, it seems,
whereas the advance scan needs only O(n) pointer checks, which is the
smaller number?
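For concreteness, a hypothetical sketch of the advance scan being described
(names are mine, not from the patch): one O(n) pass decides whether every
element has exactly the same type, so a specialised compare can be chosen
once, with no per-compare checks afterwards.

```python
# Hypothetical sketch of the pre-sort homogeneity check; one linear pass
# of pointer (type identity) comparisons over the list.
def choose_compare(lst):
    """Return the common type if the list is homogeneous, else None
    (meaning the generic compare path must be used)."""
    if not lst:
        return None
    first_type = type(lst[0])
    for item in lst:                  # n identity checks total
        if type(item) is not first_type:
            return None               # heterogeneous: generic compare
    return first_type                 # homogeneous: safe to specialise
```

For n = 1,000,000 that is about 10^6 checks up front, versus roughly
2 * 10^6 * 20 = 4 * 10^7 checks if each of the ~n log2(n) compares does two of
them.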
>
> Even when we only do O(n) compares, my patch is still much faster (see
> benchmarks higher up in this thread). Why? Well, if you're doing the type
> checks during the compares, you're doing them across different function
> calls, with other stuff interspersed in between. So pipeline/branch
> prediction/caching is less able to optimize away the overhead of the safety
> checks (I don't know how CPUs work, but one of those is relevant here). With
> the pre-sort check, it's all in a single iteration through the list, and
> we're taking the same exact branch every time; it's much faster. Think
> iterating over a matrix row-wise versus iterating column-wise (not exactly
> relevant here since that's about cache, but that's the idea. Again, I don't
> really know how CPUs work).
Provided the code is small, I think both versions of the algorithm will benefit
from the cache and branch prediction.
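One rough way to probe the effect being argued about (my own assumed setup, not
a benchmark from this thread): time sorting a homogeneous list against the same
data with a single float mixed in, which defeats a pre-sort specialisation.

```python
# Assumed micro-benchmark setup, not from the thread; results will vary
# by interpreter version and machine, so no numbers are claimed here.
import random
import timeit

homogeneous = [random.randint(0, 10**6) for _ in range(10**5)]
mixed = homogeneous[:-1] + [0.5]   # one float forces the generic path

t_homog = timeit.timeit(lambda: sorted(homogeneous), number=20)
t_mixed = timeit.timeit(lambda: sorted(mixed), number=20)
print(f"homogeneous: {t_homog:.3f}s  mixed: {t_mixed:.3f}s")
```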
>
> So in short: no, we can't just check as we go, because that's what we're
> already doing. But we can optimize significantly by checking in advance. I
> mean, practically speaking, the advance check is basically free. The
> compare-time checks, in sum, are not, both asymptotically and practically.
I'm not clear that this is true. But I have not read the code.
Barry
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/