> On 8 Mar 2017, at 22:58, Elliot Gorokhovsky <elliot.gorokhov...@gmail.com>
> wrote:
>
> On Wed, Mar 8, 2017 at 2:14 PM Barry <ba...@barrys-emacs.org
> <mailto:ba...@barrys-emacs.org>> wrote:
> Can you assume that the list is of type(list[0]) and use that type's optimised
> sort?
> But in the optimised sort code check that the types are as required.
> If you hit an element that is not of the required type then fall back to the
> unoptimised sort.
>
> Well, how would you tell if you've hit an element that is not of the required
> type? You'd have to check during every compare, right? And that's precisely
> what we're trying to avoid!
As I understood it, the trick behind the optimisation is to compare the type
pointer of an object to see if it's the same as the type supported by the
chosen optimised sort.
It was not clear to me that you need to scan the list at the start to make sure
it's homogeneous. Given that the type check is so cheap, would it really
slow the sort to do the pointer check in the compare code? I am not
suggesting you run the full rich compare on each comparison.
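A minimal sketch of what I mean (hypothetical names, not the actual CPython C
code): the compare itself does the cheap type-identity check and falls back to
the generic path only when it fails.

```python
# Hypothetical sketch of a per-compare type check with fallback; not the
# actual patch, just the shape of the idea in Python terms.
import operator

def make_checked_lt(expected_type, fast_lt, generic_lt):
    """Return a less-than function that verifies both operands are
    exactly `expected_type` before taking the optimised path."""
    def checked_lt(a, b):
        # The check is just two pointer (type identity) comparisons.
        if type(a) is expected_type and type(b) is expected_type:
            return fast_lt(a, b)   # optimised, type-specific compare
        return generic_lt(a, b)    # fall back to the general protocol
    return checked_lt

int_lt = make_checked_lt(int, operator.lt, operator.lt)
```

Here a non-int operand silently takes the generic branch instead of aborting
the sort, which is the fallback behaviour being discussed.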
> The whole point of my patch is that we do O(nlogn) compares, but only have
> O(n) elements, so it's much cheaper to do all the type checks in advance, and
> in the very likely case that our list is homogeneous, switch to an optimized
> special-case compare function.
So with my suggestion you do O(n log n) * 2 pointer compares, it seems,
whereas the advance scan needs only O(n) pointer checks, which is the
smaller number?
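For concreteness, a hypothetical sketch of the advance scan being described
(names are mine, not from the patch): one O(n) pass decides whether every
element has exactly the same type, so a specialised compare can be chosen
once, with no per-compare checks afterwards.

```python
# Hypothetical sketch of the pre-sort homogeneity check; one linear pass
# of pointer (type identity) comparisons over the list.
def choose_compare(lst):
    """Return the common type if the list is homogeneous, else None
    (meaning the generic compare path must be used)."""
    if not lst:
        return None
    first_type = type(lst[0])
    for item in lst:                  # n identity checks total
        if type(item) is not first_type:
            return None               # heterogeneous: generic compare
    return first_type                 # homogeneous: safe to specialise
```

For n = 1,000,000 that is about 10^6 checks up front, versus roughly
2 * 10^6 * 20 = 4 * 10^7 checks if each of the ~n log2(n) compares does two of
them.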
>
> Even when we only do O(n) compares, my patch is still much faster (see
> benchmarks higher up in this thread). Why? Well, if you're doing the type
> checks during the compares, you're doing them across different function
> calls, with other stuff interspersed in between. So pipeline/branch
> prediction/caching is less able to optimize away the overhead of the safety
> checks (I don't know how CPUs work, but one of those is relevant here). With
> the pre-sort check, it's all in a single iteration through the list, and
> we're taking the same exact branch every time; it's much faster. Think
> iterating over a matrix row-wise versus iterating column-wise (not exactly
> relevant here since that's about cache, but that's the idea. Again, I don't
> really know how CPUs work).
Provided the code is small, I think both versions of the algorithm will benefit
from the cache and branch prediction.
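One rough way to probe the effect being argued about (my own assumed setup, not
a benchmark from this thread): time sorting a homogeneous list against the same
data with a single float mixed in, which defeats a pre-sort specialisation.

```python
# Assumed micro-benchmark setup, not from the thread; results will vary
# by interpreter version and machine, so no numbers are claimed here.
import random
import timeit

homogeneous = [random.randint(0, 10**6) for _ in range(10**5)]
mixed = homogeneous[:-1] + [0.5]   # one float forces the generic path

t_homog = timeit.timeit(lambda: sorted(homogeneous), number=20)
t_mixed = timeit.timeit(lambda: sorted(mixed), number=20)
print(f"homogeneous: {t_homog:.3f}s  mixed: {t_mixed:.3f}s")
```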
>
> So in short: no, we can't just check as we go, because that's what we're
> already doing. But we can optimize significantly by checking in advance. I
> mean, practically speaking, the advance check is basically free. The
> compare-time checks, in sum, are not, both asymptotically and practically.
I'm not clear that this is true. But I have not read the code.
Barry
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/