Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-10 Thread Koos Zevenhoven
On Mon, Mar 6, 2017 at 4:45 AM, Steven D'Aprano  wrote:
> On Sun, Mar 05, 2017 at 07:19:43AM +, Elliot Gorokhovsky wrote:
>
>> You may remember seeing some messages on here about optimizing list.sort()
>> by exploiting type-homogeneity: since comparing apples and oranges is
>> uncommon (though possible, i.e. float to int), it pays off to check if the
>> list is type-homogeneous
>
> I sometimes need to know if a list is homogeneous, but unfortunately
> checking large lists for a common type in pure Python is quite slow.
>
> Here is a radical thought... why don't lists track their common type
> themselves? There's only a few methods which can add items:
>
> - append
> - extend
> - insert
> - __setitem__
>

I can also imagine other places where knowing those one or two bits of
information about homogeneity might potentially allow speedups:
converting lists or tuples to numpy arrays, min, max, sum etc.

If this extra one or two bits of information were tracked, and the
overhead of doing that was very small, then operations over the
collections could benefit regardless of the complexity of the
algorithm, including O(n) operations.

Too bad that one common thing done with lists – iterating – does not
have obvious benefits from type homogeneity in CPython.

—Koos


> Suppose we gave lists a read-only attribute, __type_hint__, which
> returns None for heterogeneous lists and the type for homogeneous lists. Adding
> an item to the list behaves as follows:
>
> - if the list is empty, adding an item sets __type_hint__ to type(item);
> - if the list is not empty, adding an item tests whether type(item)
>   is identical to (not a subclass of) __type_hint__, and if not, sets
>   __type_hint__ to None;
> - removing an item doesn't change the __type_hint__ unless the list
>   becomes empty, in which case it is reset to None;
> - if the internal allocated space of the list shrinks, that triggers
>   a recalculation of the __type_hint__ if it is currently None.
>
> (There's no need to recalculate the hint if it is not None.)
>
> Optional: doing a list.sort() could also recalculate the hint.
>
>
> The effect will be:
>
> - if __type_hint__ is a type object, then you can be sure that
>   the list is homogeneous;
>
> - if the __type_hint__ is None, then it might still be
>   homogeneous, but it isn't safe to assume so.
>
> Not only could sorting take advantage of the type hint without needing
> to do a separate O(N) scan of the list, but so could other code. I know
> I would be interested in using this. I have a fair amount of code that
> has to track the type of any items seen in a list, and swap to a "type
> agnostic but slow" version if the list is not homogeneous. I could
> probably replace that with some variation of:
>
> if thelist.__type_hint__ is None:
>     process_slow(thelist)
> else:
>     process_fast(thelist)
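
A rough pure-Python sketch of the tracking rules quoted above may make them
concrete. Everything here is hypothetical (the class name TypeHintedList and
its helpers are invented for illustration); the actual proposal targets the C
list implementation, where the hint would just be an extra pointer slot:

class TypeHintedList(list):
    """Hypothetical pure-Python approximation of the proposed hint tracking."""

    def __init__(self, iterable=()):
        super().__init__(iterable)
        self._recalculate_hint()

    @property
    def __type_hint__(self):
        # None means "heterogeneous or unknown"; a type means "definitely homogeneous"
        return self._type_hint

    def _recalculate_hint(self):
        types = {type(item) for item in self}
        self._type_hint = types.pop() if len(types) == 1 else None

    def _note_added(self, item):
        if len(self) == 1:                       # list was empty before this add
            self._type_hint = type(item)
        elif type(item) is not self._type_hint:  # identity check, not subclass
            self._type_hint = None

    def append(self, item):
        super().append(item)
        self._note_added(item)

    def insert(self, index, item):
        super().insert(index, item)
        self._note_added(item)

    def extend(self, iterable):
        for item in iterable:
            self.append(item)

    def __setitem__(self, index, item):
        super().__setitem__(index, item)
        if isinstance(index, slice):
            self._recalculate_hint()             # slice assignment: just rescan
        else:
            self._note_added(item)

    def pop(self, index=-1):
        item = super().pop(index)
        if not self:                             # list emptied: reset to unknown
            self._type_hint = None
        return item

A C implementation would of course pay only a pointer comparison per mutation,
which is the whole point of the proposal.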
>
>
> At the very least, I'd be interested in experimenting with this.
>
> Thoughts?
>
>
>
> --
> Steve
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/



-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-09 Thread Tim Peters
[Tim Delaney ]
> Isn't there already always a scan of the iterable to build the keys array
> for sorting (even if no key keyword param is specified)?

No - `.sort()` is a list method, and as such has nothing to do with
arbitrary iterables, just lists (perhaps you're thinking of the
`sorted()` function?).  If no `key=` argument is passed, the list guts
itself is used (as-is) as the vector of keys.


> In which case adding the homogeneity check there seems like it shouldn't
> add much overhead at all (I have to say that I was surprised with 10+%
> reductions in speed in some of the heterogeneous TimSort tests for this
> reason).

Those are worst cases where the current sort does very few compares
(like just N-1 for a list of length N).  Because they do "amazingly"
few compares, they're already "amazingly" fast.  And they also do
little data movement (e.g., none at all for /sort & =sort, and N//2
pointer swaps for \sort).  Because of that, any new O(N) overhead would
make them significantly slower - unless the new overhead pays off by
allowing a larger time saving than it costs (as it does when the list
is same-type).
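
For anyone who wants to see those counts for themselves, a throwaway script
along these lines (the Counted class is mine, not part of the patch) shows
that the /sort, =sort and \sort shapes each take roughly N-1 compares today:

class Counted:
    calls = 0
    def __init__(self, value):
        self.value = value
    def __lt__(self, other):          # list.sort() only ever uses "<"
        Counted.calls += 1
        return self.value < other.value

def compares_for(values):
    Counted.calls = 0
    [Counted(v) for v in values].sort()
    return Counted.calls

n = 10_000
print(compares_for(range(n)))         # /sort (ascending): ~n-1
print(compares_for([0] * n))          # =sort (all equal):  ~n-1
print(compares_for(range(n, 0, -1)))  # \sort (descending): ~n-1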

There is a "natural" place to insert "same type?" checks:  the outer
loop of the sort marches over the vector once, left to right,
alternately identifying the next natural run, then possibly extending
it and/or merging it into previous runs.  The checks could be put
there instead, but the code would be ugly and more invasive - I
wouldn't even bother trying it.


> And could specific richcompares be refactored so there was a "we really know
> what the types are, no need to check" version available to sort() (with
> the typechecking version available for general use/unoptimised sorting)?

They're not _just_ type-aware.  For example, the special function for
ints is specialized to integers that happen to fit in one (internal)
"digit", and the special function for strings is specialized to those
that happen to be stored in PyUnicode_1BYTE_KIND format.  Adding such
stuff to the public C API would be ... well, for a start, tedious ;-)
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-09 Thread Chris Barker
On Thu, Mar 9, 2017 at 2:04 AM, Barry Scott  wrote:

> It was not clear to me that you need to scan the list at the start to make
> sure it's homogeneous. Given that the type check is so cheap, will it
> slow the sort if you do the pointer check in the compare code? I am not
> suggesting you run rich compare full fat on each compare.
>

I think you have a point here.

IIUC, the current code makes no assumptions about type homogeneity, so it
must do something generic at each compare. Perhaps that generic thing (of
course, Elliot knows what it is) does do a pointer compare, and then something
smart and fast for some built-ins. But there are still a few steps:

 - are these both the same type?
 - what type are they?
 - how do I compare them?
 - do the compare

These may well all be fast, but it's still a few steps, and they have to be
done O(n*log(n)) times.
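
As a caricature (this is not CPython's actual code, just a pure-Python
impression of the per-compare dispatch being described), every generic
compare has to redo something like this:

def generic_less_than(a, b):
    ta, tb = type(a), type(b)                        # what types are they?
    if ta is not tb and isinstance(b, ta):
        # a subclass on the right-hand side gets first crack at the compare
        candidates = [(tb.__gt__, b, a), (ta.__lt__, a, b)]
    else:
        candidates = [(ta.__lt__, a, b), (tb.__gt__, b, a)]
    for method, x, y in candidates:                  # how do I compare them?
        result = method(x, y)                        # do the compare
        if result is not NotImplemented:
            return bool(result)
    raise TypeError("'<' not supported between instances of "
                    "%r and %r" % (ta.__name__, tb.__name__))

Elliot's patch hoists the first two questions out of the inner loop entirely
when the pre-sort scan finds a single type.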

IIUC, Elliot's patch does something like:

First go through the whole list and do:
  What is the type of the first item (only once)
  How do I compare that type (only once)
  Is everything else in the list the same type ( O(n) )

Then the actual sort:
 - do the compare ( O(n*log(n)) )

So there is one operation done n times, and one done n*log(n) times, rather
than four done n*log(n) times. If the constants are about the same, that saves
3*n*log(n) - n operations -- or, if n is non-trivial, roughly 3*n*log(n)
operations.

So we go from 4*n*log(n) to n*log(n): about a 4-times speed-up -- in theory.
With all the other complications of computer performance, only profiling
will tell you! (Also, it's not 4 times on the whole sort -- just the
compare part -- you still need to shuffle the values around, which
presumably takes at least as long as each of the above operations.)

(not sure about all four of those steps, it may be only three -- but still
a 3-times speed-up)

Now Barry's idea:

Assume that the list is homogeneous, and crap out if it turns out it's not:

Start off:
  What is the type of the first item (only once)
  How do I compare that type (only once)
Do the sort:
  Is the next item the correct type? O(n*log(n))
  Do the compare ( O(n*log(n)) )

So we now have 2 operations to be run O(n*log(n)) times -- not as good as
only one, but still better than the 3 or 4 of the fully general sort. And
this would have less impact on the non-homogeneous case.
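
A tiny sketch of what that "check as you compare" function might look like at
the Python level (the names here are invented; the real version would be a
C-level compare function, and the fallback path is only hinted at in the
comment):

class _HeterogeneousList(Exception):
    """Raised by the checked compare when the homogeneity assumption breaks."""

def make_checked_compare(expected_type, fast_lt):
    def checked_lt(a, b):
        if type(a) is not expected_type or type(b) is not expected_type:
            # the sort would catch this and fall back to the general compare
            raise _HeterogeneousList
        return fast_lt(a, b)
    return checked_lt

The extra type test is exactly the per-compare cost being debated here.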

Though Elliot points out that this would be much harder to branch-predict,
etc... so maybe not.

Maybe worth trying out and profiling??

NOTE: I really have no idea what really goes on in that code -- so I may
have this all wrong...

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-09 Thread Tim Delaney
On 9 March 2017 at 21:04, Barry Scott  wrote:

> It was not clear to me that you need to scan the list at the start to make
> sure it's homogeneous. Given that the type check is so cheap, will it
> slow the sort if you do the pointer check in the compare code? I am not
> suggesting you run rich compare full fat on each compare.
>

Isn't there already always a scan of the iterable to build the keys array
for sorting (even if no key keyword param is specified)? In which case
adding the homogeneity check there seems like it shouldn't add much overhead
at all (I have to say that I was surprised with 10+% reductions in speed in
some of the heterogeneous TimSort tests for this reason).

And could specific richcompares be refactored so there was a "we really
know what the types are, no need to check" version available to sort()
(with the typechecking version available for general use/unoptimised
sorting)?

Tim Delaney
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-09 Thread Elliot Gorokhovsky
On Thu, Mar 9, 2017 at 3:04 AM Barry Scott  wrote:

>
> So you do O(nlogn)*2 pointer compares with my suggestion, it seems? Which
> is smaller than the O(n) pointer checks?
>

Not sure what you mean here... pretty sure your inequality is backwards?

Look -- the point is, we already *do* the pointer checks during every
compare. That's the current implementation. I believe my benchmarks are
sufficient to prove that my patch is much faster than the current
implementation. Which makes sense, because we do one type check per object,
as opposed to something like log n per object, and we do them all at once,
which I do believe is faster than doing them across calls.

Now, you're not entirely wrong -- the current implementation doesn't do the
type checks as efficiently as it could. Specifically, it does them once in
PyObject_RichCompareBool, and then *again* in ob_type->tp_richcompare
(which it has to for safety: ob_type->tp_richcompare doesn't know the
arguments have already been type-checked!) You could totally define
optimized compares that do the type-checks more efficiently, and you would
see a benefit, just not nearly as extreme as the benefit I demonstrate.

Thanks again for your feedback!
Elliot
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-09 Thread Barry Scott

> On 8 Mar 2017, at 22:58, Elliot Gorokhovsky  
> wrote:
> 
> On Wed, Mar 8, 2017 at 2:14 PM Barry  wrote:
> Can you assume that the list is of type(list[0]) and use that type's optimised
> sort?
> But in the optimised sort code check that the types are as required.
> If you hit an element that is not of the required type then fall back to the
> unoptimised sort.
> 
> Well, how would you tell if you've hit an element that is not of the required 
> type? You'd have to check during every compare, right? And that's precisely 
> what we're trying to avoid!

It seems the trick for optimisation is to compare the type pointer of
an object to see if it's the same as a type supported by the chosen optimised
sort.

It was not clear to me that you need to scan the list at the start to make sure
it's homogeneous. Given that the type check is so cheap, will it
slow the sort if you do the pointer check in the compare code? I am not
suggesting you run rich compare full fat on each compare.

> The whole point of my patch is that we do O(nlogn) compares, but only have 
> O(n) elements, so it's much cheaper to do all the type checks in advance, and 
> in the very likely case that our list is homogeneous, switch to an optimized 
> special-case compare function.

So you do O(nlogn)*2 pointer compares with my suggestion, it seems? Which is
smaller than the O(n) pointer checks?

> 
> Even when we only do O(n) compares, my patch is still much faster (see 
> benchmarks higher up in this thread). Why? Well, if you're doing the type 
> checks during the compares, you're doing them across different function 
> calls, with other stuff interspersed in between. So pipeline/branch 
> prediction/caching is less able to optimize away the overhead of the safety 
> checks (I don't know how CPUs work, but one of those is relevant here). With 
> the pre-sort check, it's all in a single iteration through the list, and 
> we're taking the same exact branch every time; it's much faster. Think 
> iterating over a matrix row-wise versus iterating column-wise (not exactly 
> relevant here since that's about cache, but that's the idea. Again, I don't 
> really know how CPUs work).

Provided the code is small, I think both versions of the algorithm will benefit
from cache and branch prediction.

> 
> So in short: no, we can't just check as we go, because that's what we're 
> already doing. But we can optimize significantly by checking in advance. I 
> mean, practically speaking, the advance check is basically free. The 
> compare-time checks, in sum, are not, both asymptotically and practically.

I'm not clear this is true. But I have not read the code.

Barry
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-08 Thread Chris Barker
On Wed, Mar 8, 2017 at 2:58 PM, Elliot Gorokhovsky <
elliot.gorokhov...@gmail.com> wrote:

>
> The whole point of my patch is that we do O(nlogn) compares, but only have
> O(n) elements, so it's much cheaper to do all the type checks in advance,
>


>  I mean, practically speaking, the advance check is basically free. The
> compare-time checks, in sum, are not, both asymptotically and practically.
>

hmm -- I know folks like to say that "the constants don't matter", but it
seems they do in this case:

without pre-checking:

O(nlogn)

with pre-checking, if homogeneous:

O(n) + O(nlogn)

so the pre-checking only looks like an added cost if you ignore the constants...

But I'm assuming (particularly with locality and branch prediction and all
that included) the constant to type-check is much smaller than the constant
to compare two unknown types, so:

TC*n + KC*n*logn

vs

UC*n*logn

where:
TC -- constant to type check
KC -- constant for a known-type compare
UC -- constant for an unknown-type compare

So if UC > TC/log(n) + KC, then this optimization makes sense.

If UC > KC and UC >>> TC, then this all works out.

But if n is HUGE, it may end up being slower (maybe more so with cache
locality???)

Which is why you need to profile all this carefully.
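
Just to make the arithmetic tangible, one can plug invented constants into the
inequality above (these numbers are made up for illustration, not measured):

from math import log2

TC = 1.0    # assumed cost of one type check
KC = 3.0    # assumed cost of one known-type compare
UC = 10.0   # assumed cost of one unknown-type compare

for n in (10, 1_000, 1_000_000):
    pre_checked = TC * n + KC * n * log2(n)   # scan once, then fast compares
    generic = UC * n * log2(n)                # fully generic compares
    winner = "pre-check wins" if pre_checked < generic else "generic wins"
    print(f"n={n}: {pre_checked:.0f} vs {generic:.0f} -> {winner}")

With any constants satisfying UC > TC/log(n) + KC, the pre-checked side comes
out ahead -- which is just the inequality restated.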

So far Elliott's experiments seem to show it works out well.

Which doesn't surprise me for built-ins like float and int that have a
native compare, but apparently also for more complex types? How about
custom classes?

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-08 Thread Elliot Gorokhovsky
On Wed, Mar 8, 2017 at 2:14 PM Barry  wrote:

> Can you assume that the list is of type(list[0]) and use that type's optimised
> sort?
> But in the optimised sort code check that the types are as required.
> If you hit an element that is not of the required type then fall back to
> the unoptimised sort.
>

Well, how would you tell if you've hit an element that is not of the
required type? You'd have to check during every compare, right? And that's
precisely what we're trying to avoid!

The whole point of my patch is that we do O(nlogn) compares, but only have
O(n) elements, so it's much cheaper to do all the type checks in advance,
and in the very likely case that our list is homogeneous, switch to an
optimized special-case compare function.

Even when we only do O(n) compares, my patch is still much faster (see
benchmarks higher up in this thread). Why? Well, if you're doing the type
checks during the compares, you're doing them across different function
calls, with other stuff interspersed in between. So pipeline/branch
prediction/caching is less able to optimize away the overhead of the safety
checks (I don't know how CPUs work, but one of those is relevant here).
With the pre-sort check, it's all in a single iteration through the list,
and we're taking the same exact branch every time; it's much faster. Think
iterating over a matrix row-wise versus iterating column-wise (not exactly
relevant here since that's about cache, but that's the idea. Again, I don't
really know how CPUs work).

So in short: no, we can't just check as we go, because that's what we're
already doing. But we can optimize significantly by checking in advance. I
mean, practically speaking, the advance check is basically free. The
compare-time checks, in sum, are not, both asymptotically and practically.
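
For anyone who wants to see the asymmetry on their own build, a quick-and-dirty
harness like this is enough (no numbers quoted here, since they depend entirely
on the interpreter and on whether the patch is applied):

import random
import timeit

homogeneous = [random.random() for _ in range(100_000)]
worst_case = homogeneous[:-1] + [1]     # a single int at the very end

for name, data in (("homogeneous", homogeneous), ("worst case", worst_case)):
    t = timeit.timeit("sorted(data)", globals={"data": data}, number=20)
    print(f"{name}: {t:.3f} s for 20 sorts")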

Best
Elliot
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-08 Thread Erik

On 08/03/17 11:07, Steven D'Aprano wrote:

> I mentioned earlier that I have code which has to track the type of list
> items, and swaps to a different algorithm when the types are not all the
> same.


Hmmm. Yes, I guess if the expensive version requires a lot of 
isinstance() messing or similar for each element then it could be better 
to have optimized versions for homogeneous lists of ints or strings etc.



>> A list.is_heterogeneous() method
>> could be implemented if it was necessary,


> I would prefer to get the list item's type:
>
> if mylist.__type_hint__ is float:


If you know the list is homogeneous then the item's type is 
"type(mylist[0])".


Also, having it be a function call gives an obvious place to put the 
transition from "unknown" to known state if the tri-state hint approach 
was taken. Otherwise, that would have to be hooked into the attribute 
access somehow.


That's for someone who wants to try implementing it to decide and 
propose though :)


E.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-08 Thread Steven D'Aprano
On Wed, Mar 08, 2017 at 01:20:19AM +, Erik wrote:

> >Part of the complexity here is that I'd like this flag to be available
> >to Python code, not just a hidden internal state of the list.
> 
> Out of interest, for what purpose? Generally, I thought Python code 
> should not need to worry about low-level optimisations such as this 
> (which are C-Python specific AIUI). 

I mentioned earlier that I have code which has to track the type of list 
items, and swaps to a different algorithm when the types are not all the 
same. In practice, I just run the "mixed types" algorithm regardless, 
because the cost of doing a scan of the items in pure Python is too 
expensive relative to the time saved. Moving some of the work into the C 
infrastructure might change that.

I'm not completely sure that this would, in fact, be useful to me, but 
I'd like the opportunity to experiment. I could try using a list 
subclass, but again, the cost of doing the type-checking in Python 
instead of C is (I think) prohibitive. Nevertheless, I understand that 
the burden of proving the usefulness of this is on me. (Or anyone else 
that wants to argue the case.)


> A list.is_heterogeneous() method 
> could be implemented if it was necessary, but how would that be used?

I would prefer to get the list item's type:

if mylist.__type_hint__ is float:
    # run optimized float version
    ...
elif mylist.__type_hint__ is int:
    ...
else:
    # unoptimized version
    ...




> >But also avoids bothering with an O(N) scan in some situations where
> >the list really is heterogeneous. So there's both an opportunity cost and
> >a benefit.
> 
> O(N) is worst case.

It is also the best and average case.

Given a homogeneous list of N items, for some arbitrary N between 0 and 
∞, you have to look at all N items before knowing that they're all the 
same type. So for the homogeneous case, the best, worst and average are 
identically O(N).

Given a heterogeneous list of N items where the first difference is 
found at index K, K can range from 1 through N-1. (By definition, it 
cannot be at index 0: you can only detect a difference in types after 
checking *two* items.) The worst case is that you don't find the 
difference until the last item, which is O(N). The best case is that you 
find the difference in position 1 and bail out early. On average, you 
will find the first difference halfway through the list, which makes it 
O(N/2) but that's just O(N). (Constant factors don't matter.)

If you want to call that O(1) for the best heterogeneous case, I won't 
argue except to say that's rather against the spirit of Big Oh analysis 
in my opinion. I think it's more realistic to say it's O(N) across all 
combinations of best/worst/average and homogeneous/heterogeneous. But of 
course if you bail out early, the constant multiplier may differ.
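
In pure Python the scan under discussion is just this (an illustrative helper,
not part of any patch):

def common_type(lst):
    """Return the single exact type of lst's items, or None if mixed/empty."""
    if not lst:
        return None
    first = type(lst[0])
    for item in lst:
        if type(item) is not first:   # identity, not isinstance: subclasses don't count
            return None               # heterogeneous: bail out at index K
    return first

The early return is what makes the best heterogeneous case cheap; the
homogeneous case always pays the full N checks.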



> Most of the anecdotal evidence in this thread so far seems to suggest 
> that heterogeneous lists are not common. May or may not be true. 
> Empirically, for me, it is true. Who knows? (and there is the question).

That will strongly depend on where the data is coming from, but when it 
comes to sorting, randomly mixed types will usually fail since 
comparisons between different types are generally illegal:

py> [1, "a"].sort()
Traceback (most recent call last):
  File "", line 1, in 
TypeError: unorderable types: str() < int()



-- 
Steve
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-07 Thread Steven D'Aprano
On Tue, Mar 07, 2017 at 09:10:00PM +, Elliot Gorokhovsky wrote:

> I think the bigger problem, though, is that most list use does *not*
> involve sorting, 

There are other uses for a type hint apart from sorting. There may be 
optimized versions of other functions (sum, math.fsum, ...) and list 
methods (e.g. count, remove) -- anything that has to walk the list 
comparing items.

All very hypothetical at the moment, I admit it...


> so it would be a shame to impose the non-trivial overhead
> of type-checking on *all* list use.

You might be right, but my guess is that the overhead isn't quite as big 
as you may think, at least for some operations. The type-check itself is 
just a pointer compare, not an issubclass() operation. The types are 
either identical, or the list is heterogeneous.

True, `mylist[i] = x` should be a quick assignment into an array of 
pointers, but there's a bounds check to ensure i is within the correct 
range. Other methods are even more expensive: `mylist[i:j] = [a...z]` 
has to move list items around, possibly allocate more memory, and update 
the list length; compared to that, the scan of a...z is probably minimal. 
`mylist.append(x)` is *usually* a simple assignment into the array (plus 
updating the list length) but every now and again it triggers a resize, 
which is costly.

I don't think we can predict whether this is a nett gain or loss just 
from first principles.


> Anyway, my patch could always be a precursor to a more general optimization
> along these lines.

Indeed! Even if nothing else comes of this than your patch, thank you!


-- 
Steve
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-07 Thread Steven D'Aprano
On Tue, Mar 07, 2017 at 08:46:45PM +, Erik wrote:

> I don't think anyone has mentioned this yet, but FWIW I think the 'type 
> hint' may need to be tri-state: heterogeneous (NULL), homogeneous (the 
> pointer to the type structure) and also "unknown" (a sentinel value - 
> the address of a static char or something).

I thought about that and rejected it as an unnecessary complication. 
Heterogeneous and unknown might as well be the same state: either way, 
you cannot use the homogeneous-type optimization.

Part of the complexity here is that I'd like this flag to be available 
to Python code, not just a hidden internal state of the list.

But my instinct for this could be wrong -- if anyone is interested to do 
some experiments on this, it might be that a three-state flag works out 
better in practice than two-states.


> Otherwise, a delete operation on the list would need to scan the list to 
> work out if it had changed from heterogeneous to homogeneous (unless it 
> was acceptable that once heterogeneous, a list is always considered 
> heterogeneous - i.e., delete always sets the hint to NULL).

In a later email, you corrected this: a delete operation need not touch 
the type-hint (except when the last item is deleted, at which point it 
resets to None/unknown).

With a three-state flag, you can make a three-way decision:

If the flag is Unknown:
- do an O(N) scan to determine whether the list is homogeneous 
  or heterogeneous, then choose the optimized or unoptimized
  routine;

if the flag is Heterogeneous:
- there is no point doing a scan, so always choose the 
  unoptimized routine;

if the flag is a type:
- the list is definitely homogeneous, so depending on the
  type, you may be able to choose an optimized routine.
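
Spelled out as code, that three-way dispatch might look roughly like this
(UNKNOWN and HETEROGENEOUS are hypothetical sentinels, and the two sort
helpers are placeholders; the real flag and routines would live in the C list
implementation):

UNKNOWN = object()         # hint never computed (or invalidated)
HETEROGENEOUS = object()   # known to contain mixed types

def optimized_sort(lst, elem_type):
    lst.sort()             # placeholder for a type-specialized sort

def unoptimized_sort(lst):
    lst.sort()             # placeholder for the fully general sort

def sort_with_hint(lst, flag):
    if flag is UNKNOWN:
        types = {type(item) for item in lst}    # one O(N) scan, then decide
        if len(types) == 1:
            optimized_sort(lst, types.pop())
        else:
            unoptimized_sort(lst)
    elif flag is HETEROGENEOUS:
        unoptimized_sort(lst)                   # no point scanning again
    else:
        optimized_sort(lst, flag)               # flag is the element type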


Compared to that, a two-state flag misses some opportunities to run 
the optimized routine:

- list contains [1, 2, 3, 4, "foo"] so the hint is set to None;
- delete the last item;
- list is now homogeneous but the hint is still None.

But it also avoids bothering with an O(N) scan in some situations where 
the list really is heterogeneous. So there's both an opportunity cost and 
a benefit.

There may be other opportunities to do the scan, such as when the 
underlying pre-allocated array resizes, so even with the two-state flag, 
"unknown" need not stay unknown forever.


> I'd prefer the sort optimization to be based on what my list contains 
> NOW, not on what it may have contained some time in the past, 

Remember, we're talking about opportunities for applying an optimization 
here, nothing more. You're not giving up anything: at worst, the 
ordinary, unoptimized routine will run and you're no worse off than you 
are today.


> so I'm not 
> a fan of the "once heterogeneous, always considered heterogeneous" 
> behaviour if it's cheap enough to avoid it.

It is not just a matter of the cost of tracking three states versus two. 
It is a matter of the complexity of the interface.

I suppose this could be reported to Python code as None, False or the 
type. Although, I don't know about you, but I know I'll never be able to 
remember whether None means Unknown and False means Heterogeneous, or the 
other way around.

Ultimately, this is all very pie-in-the-sky unless somebody tests just 
how expensive this is and whether the benefit is worthwhile. That's not 
going to be me: I'm not able to hack on the list C code.


-- 
Steve
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-07 Thread David Mertz
On Tue, Mar 7, 2017 at 4:36 PM, Erik  wrote:

> * listextend() - this should do the right thing with the type hint when
>> extending one list with another.
>>
>
> * Several other methods ('contains', 'remove', 'count', 'index') also use
> PyObject_RichCompareBool(). They could also presumably benefit from the
> same optimisation (perhaps it's not all about sort() - perhaps this gives a
> little more weight to the idea).


Good point about list.extend().  I don't think __type_hint__ could help
with .__contains__() or .count() or .remove().  E.g.:

In [7]: lst = [1.0, 2.0, 1+0j, F(1,1)]
In [8]: from fractions import Fraction as F
In [9]: lst = [1.0, 2.0, 1+0j, F(1,1)]
In [10]: 1 in lst
Out[10]: True
In [11]: lst.count(1)
Out[11]: 3
In [12]: l.index(1)
Out[12]: 0


The list has absolutely nothing of the right type.  Yet it contains an
item, counts things that are equal, finds a position for an equal item.

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-07 Thread Elliot Gorokhovsky
On Tue, Mar 7, 2017 at 1:47 PM Erik  wrote:

>
> I'd prefer the sort optimization to be based on what my list contains
> NOW, not on what it may have contained some time in the past, so I'm not
> a fan of the "once heterogeneous, always considered heterogeneous"
> behaviour if it's cheap enough to avoid it.
>

Sure. Dictionaries actually don't implement this, though: as soon as they
see a non-string key, they permanently switch to a heterogeneous state
(IIRC).

I think the bigger problem, though, is that most list use does *not*
involve sorting, so it would be a shame to impose the non-trivial overhead
of type-checking on *all* list use. With dictionaries, you have to
type-check inserts anyway (for equality testing), so it's less of a
problem.  But the fundamental list operations *don't* require type-checking
currently, so why add it in? In practice, the pre-sort check is *very*
cheap, and its cost is only imposed on sort usage.

Anyway, my patch could always be a precursor to a more general optimization
along these lines. I'm almost finished fixing the problem Tim identified
earlier in this thread; after that, it'll be ready for review!
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-07 Thread Erik

On 07/03/17 20:46, Erik wrote:

> (unless it
> was acceptable that once heterogeneous, a list is always considered
> heterogeneous - i.e., delete always sets the hint to NULL).


Rubbish. I meant that delete would not touch the hint at all.

E.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-06 Thread Chris Barker
On Sun, Mar 5, 2017 at 6:45 PM, Steven D'Aprano  wrote:

> I sometimes need to know if a list is homogeneous, but unfortunately
> checking large lists for a common type in pure Python is quite slow.
>
> Here is a radical thought... why don't lists track their common type
> themselves? There's only a few methods which can add items:
>

For what it's worth, I suggested this a LONG time ago -- well before Python-ideas
existed. I thought a "homogeneous sequence" could be really useful
for all sorts of optimizations. (At the time I was writing C extensions,
and often converting a lot of lists to numpy arrays -- which ARE homogeneous
sequences.)

Anyway -- it was roundly rejected by Guido and others; no one had any
interest in the idea.

But maybe now that there is a compelling use-case for the built-in object,
the time is right??

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-06 Thread Terry Reedy

On 3/6/2017 1:56 AM, Elliot Gorokhovsky wrote:


> P.S. Is it OK if I close my current issue on the bug tracker and open a
> new issue, where I'll post the revised patch? The writing on my current
> issue uses the old, less-rigorous benchmarks, and I feel it would be
> less confusing if I just made a new issue and posted the correct
> benchmarks/discussion at the top. The current issue doesn't have many
> comments, so not much would be lost by just linking to it from the new
> issue. If this violates bug tracker etiquette, however, :) I'll just
> post the revised patch on my current issue.


That would be normal.  Issues often get revised patches, sometimes with 
more severe changes than this.  Or people post competing patches.  Do 
reference this thread, and quote Tim's approval in principle, if he did 
not post on the tracker.


--
Terry Jan Reedy

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Elliot Gorokhovsky
On Sun, Mar 5, 2017 at 11:31 PM Tim Peters  wrote:

>
> The best approach is indeed to pass the function pointer to every
> location that needs it.  Note that a MergeState struct is already
> created per sort invocation; that isn't a file global, for much the
> same reason.
>

Right. It's a real shame, because the patch as it stands right now is
extremely surgical, but there's clearly no way around it. There are some
functions that don't take in MergeState (e.g. gallop_left) but that call
ISLT/IFLT, so I think I'll just add a compare function pointer parameter to
all the function calls. I mean, the diff will be a lot hairier, which might
make the patch more difficult to review, but when it comes down to it the
code complexity won't really increase, so overall I don't think this is the
end of the world.

Thanks so much for pointing this out!

P.S. Is it OK if I close my current issue on the bug tracker and open a new
issue, where I'll post the revised patch? The writing on my current issue
uses the old, less-rigorous benchmarks, and I feel it would be less
confusing if I just made a new issue and posted the correct
benchmarks/discussion at the top. The current issue doesn't have many
comments, so not much would be lost by just linking to it from the new
issue. If this violates bug tracker etiquette, however, :) I'll just post
the revised patch on my current issue.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Chris Angelico
On Mon, Mar 6, 2017 at 5:52 PM, Tim Peters  wrote:
> [Chris Angelico ]
>> Arbitrary comparison functions let you do anything but whoa, I
>> cannot imagine any way that this would ever happen outside of "hey
>> look, here's how you can trigger a SystemError"!
>
> CPython is full of defensive code protecting against malicious crap.
> That's why it rarely crashes ;-)
>
> def __lt__(self, other):
>     return self.size < other.size
>
> Looks harmless?  Can't tell!  For all we know, there are proxy
> objects, and other.__getattr__ invokes some elaborate library to open
> a socket in a new thread to fetch the value of `size` over a network.

Exactly. It's always fun to discover some nice tidy exception that
cleanly copes with a ridiculous situation.

def gen():
    yield next(g)
g = gen()
next(g)

Fortunately in this case, the solution isn't to say "SystemError:
cannot create threads while sorting", but even if that were the case,
I can't imagine that much production code would be stopped by it.

ChrisA
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Tim Peters
[Chris Angelico ]
> Arbitrary comparison functions let you do anything but whoa, I
> cannot imagine any way that this would ever happen outside of "hey
> look, here's how you can trigger a SystemError"!

CPython is full of defensive code protecting against malicious crap.
That's why it rarely crashes ;-)

def __lt__(self, other):
    return self.size < other.size

Looks harmless?  Can't tell!  For all we know, there are proxy
objects, and other.__getattr__ invokes some elaborate library to open
a socket in a new thread to fetch the value of `size` over a network.
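
A contrived but concrete illustration of the point -- __lt__ can run arbitrary
Python, including code that mutates the very list being sorted, and CPython's
defensive code is what turns that into a clean error (the exact message may
vary by version):

class Sneaky:
    def __init__(self, size, victim):
        self.size = size
        self.victim = victim
    def __lt__(self, other):
        self.victim.append(None)      # arbitrary side effect mid-sort
        return self.size < other.size

lst = []
lst.extend(Sneaky(v, lst) for v in (3, 1, 2))
try:
    lst.sort()
except ValueError as exc:
    print(exc)                        # e.g. "list modified during sort"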
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Elliot Gorokhovsky
On Sun, Mar 5, 2017 at 11:39 PM INADA Naoki  wrote:

>
> I think there is another safe solution: give up unsafe_object_compare.
>
> The compare functions for long, float, and unicode must not call list.sort(),
> and must not release the GIL.
> So all you need is: back up the old compare_function before the sort, and
> restore it after the sort.
>

That would definitely work, but I think it's too much of a sacrifice for
these reasons:
(1) You would have to give up on optimizing tuples (because tuple compares
call PyObject_RichCompareBool, which could itself call arbitrary code).
That's a real shame, because sorting tuples is common (e.g., in DEAP, you
sort tuples of floats representing the fitnesses of individuals). The
alternative would be to only optimize tuples of long/string/float. However,
I think the code complexity of *that* would be greater than the, admittedly
messy, solution of passing in the compare everywhere it's needed.
(2) There are lots of types, e.g. bytes, that would fall through the cracks.
(3) Correct me if I'm wrong, but this would make us depend on the GIL, right?
And we don't want to depend on that if we don't have to, right?

It seems to me that the only solution here is to go in and add a compare
function pointer to each function call. Extremely unfortunate, but
necessary.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Jelle Zijlstra
2017-03-05 22:08 GMT-08:00 Elliot Gorokhovsky :
> Another solution: check if there is more than one thread; if there is, then
> disable optimization. Is sorting in multithreaded programs common enough to
> warrant adding the complexity to deal with it?
>

I think using a global is unsafe even without multithreading, because
the compare function itself could end up doing list.sort() (it's
calling arbitrary Python code after all).
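
A minimal example of that re-entrancy (nothing exotic -- just a __lt__ that
happens to sort another list), which is perfectly legal today but would
clobber a single global compare-function pointer:

class Reentrant:
    def __init__(self, value):
        self.value = value
    def __lt__(self, other):
        ["b", "a"].sort()             # a nested sort inside the outer sort's compare
        return self.value < other.value

outer = [Reentrant(v) for v in (3, 1, 2)]
outer.sort()                          # fine today; a global pointer would be overwritten
print([x.value for x in outer])       # [1, 2, 3]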

>
> On Sun, Mar 5, 2017 at 10:52 PM Elliot Gorokhovsky
>  wrote:
>>
>> On Sun, Mar 5, 2017 at 10:45 PM Elliot Gorokhovsky
>>  wrote:
>>>
>>>
>>> the problem is, how can we get a unique identifier for the thread in a
>>> platform-independent way? Any ideas?
>>
>>
>> Oh, I could probably just copy over code from threading.get_ident()... not
>> sure if the key-value table is a good solution, though.
>
>
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Elliot Gorokhovsky
On Sun, Mar 5, 2017 at 10:45 PM Elliot Gorokhovsky <
elliot.gorokhov...@gmail.com> wrote:

>
> the problem is, how can we get a unique identifier for the thread in a
> platform-independent way? Any ideas?
>

Oh, I could probably just copy over code from threading.get_ident()... not
sure if the key-value table is a good solution, though.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Elliot Gorokhovsky
On Sun, Mar 5, 2017 at 10:25 PM David Mertz  wrote:

>
> Could we make the file global a table of comparison functions and have
> each thread reference a position in the table? It would be fine if multiple
> threads happened to use the position of e.g. the int comparison, just so
> long as each chose the right slot in the table.
>

I don't think that would solve anything. The problem is that the pre-sort
check is carried out in the main sort function, but then lots of smaller
functions need to know which compare to use. Since the scope isn't shared,
then, how can we inform the smaller functions of which compare to use?

I solved this problem by just putting the compare function pointer in
global scope. Obviously, however, this is not thread-safe. So the question
is, how can the smaller functions be made aware of the results of the
pre-sort check? Your proposal would merely replace the problem of
communicating the function pointer with the problem of communicating the
appropriate index into the function table. The smaller functions wouldn't
know which index they're supposed to use unless you passed it in.

However, I believe the following *would* work:
Suppose it were possible to access an identifier unique to each thread,
such as that returned by gettid() on Linux. Then maintain a global table of
(unique_thread_id, compare_func) pairs, and simply have the ISLT macro
index into the table! A bit of a performance hit, sure, but hopefully not
that bad? Certainly better than passing it in every time... the problem is,
how can we get a unique identifier for the thread in a platform-independent
way? Any ideas?
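
At the Python level the idea would look something like the following sketch
(the real table and lookups would be in C inside the ISLT macro;
threading.get_ident() is just the portable spelling of a per-thread id, and
the lambda is a stand-in for one of the patch's specialized compares):

import threading

_compare_table = {}   # thread id -> compare function chosen by the pre-sort check

def set_compare(func):
    _compare_table[threading.get_ident()] = func

def islt(a, b):
    # what ISLT would conceptually do under this scheme: one lookup per compare
    return _compare_table[threading.get_ident()](a, b)

set_compare(lambda x, y: x < y)       # e.g. after the pre-sort check picks "all floats"
print(islt(1.5, 2.5))                 # True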
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Elliot Gorokhovsky
On Sun, Mar 5, 2017 at 9:25 PM Tim Peters  wrote:

>
> Would someone please move the patch along?  I expect it's my fault it's
> languished so long, since I'm probably the natural person to review it, but
> I've been buried under other stuff.
>
> But the patch doesn't change anything about the sorting algorithm itself -
> even shallow knowledge of how timsort works is irrelevant.  It's just
> plugging in a different bottom-level object comparison _function_ when that
> appears valuable.
>
> I've said from the start that it's obvious (to me ;-) ) that it's an
> excellent tradeoff.  At worst it adds one simple (pre)pass over the list
> doing C-level pointer equality comparisons.  That's cheap.  The worst-case
> damage is obviously small, the best-case gain is obviously large, and the
> best cases are almost certainly far more common than the worst cases in
> most code.
>

Thank you so much for the support! Yes to all of those things!


>
> In reviewing my own code, after it was fiddled to work under Python 3
> there are no mixed-type lists that are ever sorted.  There are lists with
> complex objects, but whenever those are sorted there's a `key=` argument
> that reduces the problem to sorting ints or tuples of builtin scalar types.
>

I'm adding that quote to the next version my poster :)


>
> One subtle thing to look at:  thread safety.  IIRC, the patch plugged the
> comparison function into a file global.  That's obviously hosed if multiple
> threads sort different kinds of lists simultaneously.
>

Wow, that is a *very* good point. I never think about those kinds of
things, being a n00b, so thanks for catching that... I'll have to go in an
fix that. I'm not sure how, though, because the ISLT macro gets called in a
bunch of different functions. The only way I can think of to fix it would
be to pass the function pointer as an argument to *all* the functions that
use ISLT, which would be pretty messy. What do you think would be the
easiest fix?

Thanks!
Elliot
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Tim Peters
[Elliot Gorokhovsky ]
> (Summary of results: my patch at https://bugs.python.org/issue28685 makes
> list.sort() 30-50% faster in common cases, and at most 1.5% slower in the
> uncommon worst case.)
> ...

Would someone please move the patch along?  I expect it's my fault it's
languished so long, since I'm probably the natural person to review it, but
I've been buried under other stuff.

But the patch doesn't change anything about the sorting algorithm itself -
even shallow knowledge of how timsort works is irrelevant.  It's just
plugging in a different bottom-level object comparison _function_ when that
appears valuable.

I've said from the start that it's obvious (to me ;-) ) that it's an
excellent tradeoff.  At worst it adds one simple (pre)pass over the list
doing C-level pointer equality comparisons.  That's cheap.  The worst-case
damage is obviously small, the best-case gain is obviously large, and the
best cases are almost certainly far more common than the worst cases in
most code.

In reviewing my own code, after it was fiddled to work under Python 3 there
are no mixed-type lists that are ever sorted.  There are lists with complex
objects, but whenever those are sorted there's a `key=` argument that
reduces the problem to sorting ints or tuples of builtin scalar types.

I don't care about anyone else's code ;-)

One subtle thing to look at:  thread safety.  IIRC, the patch plugged the
comparison function into a file global.  That's obviously hosed if multiple
threads sort different kinds of lists simultaneously.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Elliot Gorokhovsky
On Sun, Mar 5, 2017 at 9:12 PM MRAB  wrote:

>
> Although it's true that both programmers and Python might treat 10 as
> functionally identical to 10.0, in practice the numbers that are being
> added to the list probably come from some code that returns integers
> /or/ floats, rather than a mixture.
>

Yes, exactly. So we can see how the homogeneity assumption is a reasonable
one to make; unless you're defining custom compares (uncommon), I don't see
where you would ever be sorting a heterogeneous list except for the
int/float case.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread MRAB

On 2017-03-06 03:09, Chris Angelico wrote:

> On Mon, Mar 6, 2017 at 2:03 PM, Elliot Gorokhovsky
>  wrote:
>
>> On Sun, Mar 5, 2017 at 7:50 PM Chris Angelico  wrote:
>>
>>> I would be rather curious to know how frequently a list consists of
>>> "numbers", but a mix of ints and floats. Does it happen a
>>> lot in real-world code?
>>
>> This is of course undecidable to verify statically, so we can't just crawl
>> PyPI... however, I would argue that using mixed float-int lists is
>> dangerous, and is more dangerous in Python 3 than in Python 2. So hopefully
>> this is not very common. However, even if 10% (surely a vast overestimate)
>> of sort calls are to mixed int-float lists, my patch would still yield a
>> significant savings on average.
>
> I agree that it's dangerous, but it is still common for programmers
> and Python alike to treat 10 as functionally identical to 10.0 -
> although as to being more dangerous in Py3, that's much of a muchness
> (for instance, the single-slash division operator in Py2 can
> potentially truncate, but in Py3 it's always going to give you a
> float). But, fair point. I very much doubt it's as high as 10%, so
> yeah, that would be advantageous.
>
> Also, the performance hit is so small, and even that is in the very
> worst case (a homogeneous list with one different type at the end). I
> like changes that make stuff run faster.

Although it's true that both programmers and Python might treat 10 as 
functionally identical to 10.0, in practice the numbers that are being 
added to the list probably come from some code that returns integers 
/or/ floats, rather than a mixture.


___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Nick Timkovich
I see the benchmarks, and while I assume the asymptotic complexity is the
same, is there a longer "start-up" time your optimizations need? Do you
have benchmarks where you sort 10, 100...10**6 items to show that beyond
the types you're sorting, you're not amortizing any increased overhead out
to oblivion?

On Sun, Mar 5, 2017 at 9:24 PM, Elliot Gorokhovsky <
elliot.gorokhov...@gmail.com> wrote:

> On Sun, Mar 5, 2017 at 8:20 PM David Mertz  wrote:
>
>>
>> B. I think a very large percentage of lists are heterogeneous.  But most
>> of the time when they are, it's not because there are several different
>> numeric types but rather because a list collects some sort of custom
>> objects.  Maybe those even all share some ancestor, but the point is they
>> are user/library defined classes that won't have a fast comparison like
>> ints or floats do.
>>
>
> As long as the ob_type for all the objects is the same, my patch will sort
> significantly faster, as it replaces PyObject_RichCompareBool with
> ob_type->tp_richcompare. It doesn't matter if your list is builtin-types or
> not, as long as the types are all the same, you get a speedup (though it's
> greater for built-in types).
>
> Also, I actually wrote a crawler to see how many PyPI packages implement a
> custom compare function (like you would have to for user-defined types).
> The answer is less than 6%. Code:
> https://github.com/embg/python-fast-listsort/tree/master/crawler
>
>
>> B (part 2): I haven't actually read his patch, but I'm assuming that
>> Elliot's approach can start scanning the list, and as soon as it find a
>> complex/custom object at index 0 ignore the rest of the scan.  So for that
>> case, there is very little harm in his linear scan (over one item).
>>
>
> Yup, the pre-sort check breaks out of the loop as soon as it sees
> heterogeneity. So unless your float is at the end of the whole list of ints
> (as in my worst-case benchmark), the cost is basically 0.
>
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Elliot Gorokhovsky
On Sun, Mar 5, 2017 at 8:23 PM David Mertz  wrote:

>
> But also, it's not just the number of list objects that are sorted, but
> how often it's done.  I could have a 10,000 line program with only one call
> to `my_list.sort()` in it... but that one line is something that is called
> inside an inner loop.  This feels like it really needs profiling not just
> static analysis.
>

Totally -- but what I mean is if you look at the performance benchmark
suite, for example, most of the benchmarks do not have the string "sort" in
their source code (IIRC). I think that most applications don't spend most
of their time sorting. That's not to say sorting isn't important -- I
wouldn't have written my patch if I didn't think it is. I'm just saying it
probably isn't a good idea to penalize *all* list use just to benefit the
minority of list use that involves sorting.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Elliot Gorokhovsky
On Sun, Mar 5, 2017 at 8:20 PM David Mertz  wrote:

>
> B. I think a very large percentage of lists are heterogeneous.  But most
> of the time when they are, it's not because there are several different
> numeric types but rather because a list collects some sort of custom
> objects.  Maybe those even all share some ancestor, but the point is they
> are user/library defined classes that won't have a fast comparison like
> ints or floats do.
>

As long as the ob_type for all the objects is the same, my patch will sort
significantly faster, as it replaces PyObject_RichCompareBool with
ob_type->tp_richcompare. It doesn't matter if your list is builtin-types or
not, as long as the types are all the same, you get a speedup (though it's
greater for built-in types).

Also, I actually wrote a crawler to see how many PyPI packages implement a
custom compare function (like you would have to for user-defined types).
The answer is less than 6%. Code:
https://github.com/embg/python-fast-listsort/tree/master/crawler


> B (part 2): I haven't actually read his patch, but I'm assuming that
> Elliot's approach can start scanning the list, and as soon as it find a
> complex/custom object at index 0 ignore the rest of the scan.  So for that
> case, there is very little harm in his linear scan (over one item).
>

Yup, the pre-sort check breaks out of the loop as soon as it sees
heterogeneity. So unless your float is at the end of the whole list of ints
(as in my worst-case benchmark), the cost is basically 0.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Elliot Gorokhovsky
On Sun, Mar 5, 2017 at 8:10 PM Chris Angelico  wrote:

>
> Also, the performance hit is so small, and even that is in the very
> worst case (a homogeneous list with one different type at the end). I
> like changes that make stuff run faster.
>

I agree.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread David Mertz
In my experience and/or estimation... well two things:

A. Mixtures of floats/ints seem a lot more common than the 10% being thrown
around.  I don't even consider that bad practice currently (it might become
bad practice if Elliot's patch is used, since keeping elements float-only
would allow much faster sorting).

B. I think a very large percentage of lists are heterogeneous.  But most of
the time when they are, it's not because there are several different
numeric types but rather because a list collects some sort of custom
objects.  Maybe those even all share some ancestor, but the point is they
are user/library defined classes that won't have a fast comparison like
ints or floats do.

B (part 2): I haven't actually read his patch, but I'm assuming that
Elliot's approach can start scanning the list, and as soon as it finds a
complex/custom object at index 0, ignore the rest of the scan.  So for that
case, there is very little harm in his linear scan (over one item).

On Sun, Mar 5, 2017 at 7:09 PM, Chris Angelico  wrote:

> On Mon, Mar 6, 2017 at 2:03 PM, Elliot Gorokhovsky
>  wrote:
> > On Sun, Mar 5, 2017 at 7:50 PM Chris Angelico  wrote:
> >>
> >>
> >> I would be rather curious to know how frequently a list consists of
> >> "numbers", but a mix of ints and floats. Does it happen a
> >> lot in real-world code?
> >>
> >
> > This is of course undecidable to verify statically, so we can't just
> crawl
> > PyPI... however, I would argue that using mixed float-int lists is
> > dangerous, and is more dangerous in Python 3 than in Python 2. So
> hopefully
> > this is not very common. However, even if 10% (surely a vast
> overestimate)
> > of sort calls are to mixed int-float lists, my patch would still yield a
> > significant savings on average.
>
> I agree that it's dangerous, but it is still common for programmers
> and Python alike to treat 10 as functionally identical to 10.0 -
> although as to being more dangerous in Py3, that's much of a muchness
> (for instance, the single-slash division operator in Py2 can
> potentially truncate, but in Py3 it's always going to give you a
> float). But, fair point. I very much doubt it's as high as 10%, so
> yeah, that would be advantageous.
>
> Also, the performance hit is so small, and even that is in the very
> worst case (a homogeneous list with one different type at the end). I
> like changes that make stuff run faster.
>
> ChrisA
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>



-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread David Mertz
On Sun, Mar 5, 2017 at 6:45 PM, Steven D'Aprano  wrote:

> Here is a radical thought... why don't lists track their common type
> themselves? There's only a few methods which can add items:
>

I had exactly the same thought.  Lists would need to grow a new attribute,
of course. I'm not sure how that would affect the object layout and word
boundaries.  But there might be free space for another attribute slot.

The real question is whether doing this is a win.  On each append/mutation
operation we would need to do a comparison to the __type_hint__ (assuming
Steven's spelling of the attribute).  That's not free.  Balancing that,
however, when we actually *did* a sort, it would be O(1) to tell if it was
homogeneous (and also the actual type if yes) rather than O(N).

This question isn't really subject to microbenchmarks though.  If we added
__type_hint__ as None/type-object and added those comparisons to it on
.insert()/.append()/etc, then we would be slower by some increment while
all we were doing was adding things.  There could only be a win when the
list is sorted (but a big win for that case).

In real-world code, how often are lists sorted? Also, I'm not really
confident that Elliot's estimate of 95% of lists being homogeneous holds,
but the speedup he proposes would seem to win even if that percentage is a
lot lower than 95%.  If only 10% of lists in real-world code ever get
`my_list.sort()` called on them, Steven's idea is probably not good.  If
50% of lists do, it probably is.  But then, it also depends just how *often*
lists that get sorted are re-sorted.

Yours, David...

P.S. I think that, given that we can .append() then delete items to make a
heterogeneous list become homogeneous again, Elliot's idea is somewhat
orthogonal to Steven's.  That is, even if we had .__type_hint__ on lists,
it might be a "false None" and it could still be worth doing Elliot's
linear scan anyway.  On the other hand, the None-ness on non-empty lists
might be a good enough predictor of heterogeneity in real world code that
the linear scan would almost always be a waste.  I do not know without
benchmarks against real codebases.

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Elliot Gorokhovsky
On Sun, Mar 5, 2017 at 7:50 PM Chris Angelico  wrote:

>
> I would be rather curious to know how frequently a list consists of
> "numbers", but a mix of ints and floats. Does it happen a
> lot in real-world code?
>
>
This is of course undecidable to verify statically, so we can't just crawl
PyPI... however, I would argue that using mixed float-int lists is
dangerous, and is more dangerous in Python 3 than in Python 2. So hopefully
this is not very common. However, even if 10% (surely a vast overestimate)
of sort calls are to mixed int-float lists, my patch would still yield a
significant savings on average.

> (Apologies if this has already been mentioned; I've been skimming the
> thread, not reading it in detail.)


It hasn't been mentioned in this thread :)
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Elliot Gorokhovsky
This is exactly what I would have done if I didn't think the changes would
be too large-scale to ever be accepted (at least if proposed by someone
without any experience). This is also what my friend suggested when I
explained my project to him. In fact, this is exactly what dictionaries do:
at each insert, they check whether a string is being inserted. If so, they
stay in a "string" state; if not, they switch (permanently) to the general
state, the equivalent of your None state.
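
A toy pure-Python model of that dict behaviour (a sketch by analogy, not
CPython's actual C implementation; the class name is made up) looks something
like this:

class StringKeyedDict:
    """Toy model: specialized while all keys are str, general forever after."""
    def __init__(self):
        self._data = {}
        self._all_string_keys = True          # the "string" state flag

    def __setitem__(self, key, value):
        if self._all_string_keys and type(key) is not str:
            self._all_string_keys = False     # one-way switch to the general state
        self._data[key] = value

    def __getitem__(self, key):
        # A real implementation would pick a faster lookup routine while
        # _all_string_keys is True; this toy only tracks the state.
        return self._data[key]

d = StringKeyedDict()
d["x"] = 1
print(d._all_string_keys)   # True
d[42] = 2
print(d._all_string_keys)   # False, and it stays False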

Again, the only reason I did it differently is that I wanted to be
realistic about what changes I could get accepted. My patch does everything
inside the list sort function, and doesn't modify anything outside of that,
so it's presumably much easier to review. I mean, I submitted my patch in
November and it still hasn't been accepted :) so presumably something as
large-scale as this would have very little chance.

One more thing: changing mutating methods to check type would make them
non-trivially slower; if you aren't resizing the list, appending should
really just be as simple as storing a pointer and incrementing a counter.
If we have to check type, that turns two simple things into three simple
things, which could have significant performance implications. Most
applications of lists *don't* care about type homogeneity (e.g. simply
iterating through the list), so one would have to justify imposing this
cost on all list use just to optimize for the special cases where you *do*
care. I'm not sure how one would go about checking whether that trade-off is
a good one: what is the distribution of list use across all Python
code? It's not something you can really check statically (it's undecidable
if you're being pedantic). With my patch, it's easy: if you aren't sorting,
you don't have to pay anything. If you are sorting, you either win or lose,
based on whether or not you're sorting type-homogeneous data (which
you basically always are).

If anything, I think my patch could be a starting point for later
optimization along these lines, depending on whether the problem raised in
the previous paragraph can be addressed. What do you think?

On Sun, Mar 5, 2017 at 7:47 PM Steven D'Aprano  wrote:

On Sun, Mar 05, 2017 at 07:19:43AM +, Elliot Gorokhovsky wrote:

> You may remember seeing some messages on here about optimizing list.sort()
> by exploiting type-homogeneity: since comparing apples and oranges is
> uncommon (though possible, i.e. float to int), it pays off to check if the
> list is type-homogeneous

I sometimes need to know if a list is homogeneous, but unfortunately
checking large lists for a common type in pure Python is quite slow.

Here is a radical thought... why don't lists track their common type
themselves? There's only a few methods which can add items:

- append
- extend
- insert
- __setitem__


Suppose we gave lists a read-only attribute, __type_hint__, which
returns None for heterogeneous lists and the type for homogeneous lists.
Adding an item to the list works as follows:

- if the list is empty, adding an item sets __type_hint__ to type(item);
- if the list is not empty, adding an item tests whether type(item)
  is identical to (not a subclass of) __type_hint__, and if not, sets
  __type_hint__ to None;
- removing an item doesn't change the __type_hint__ unless the list
  becomes empty, in which case it is reset to None;
- if the internal allocated space of the list shrinks, that triggers
  a recalculation of the __type_hint__ if it is currently None.

(There's no need to recalculate the hint if it is not None.)

Optional: doing a list.sort() could also recalculate the hint.


The effect will be:

- if __type_hint__ is a type object, then you can be sure that
  the list is homogeneous;

- if the __type_hint__ is None, then it might still be
  homogeneous, but it isn't safe to assume so.

Not only could sorting take advantage of the type hint without needing
to do a separate O(N) scan of the list, but so could other code. I know
I would be interested in using this. I have a fair amount of code that
has to track the type of any items seen in a list, and swap to a "type
agnostic but slow" version if the list is not homogeneous. I could
probably replace that with some variation of:

if thelist.__type_hint__ is None:
    process_slow(thelist)
else:
    process_fast(thelist)


At the very least, I'd be interested in experimenting with this.

Thoughts?



--
Steve
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Chris Angelico
On Mon, Mar 6, 2017 at 12:53 PM, Elliot Gorokhovsky
 wrote:
> (1) lists are idiomatically type-homogeneous. To quote the Python
> documentation, "Lists are mutable, and *their elements are usually
> homogeneous* and are accessed by iterating over the list".
> (2) it's not very common to have to "compare apples and oranges". While it's
> technically possible to define comparison between any two types you like,
> and in practice one sometimes compares e.g. ints and floats, in practice
> it's pretty safe to assume the lists you're sorting are going to be
> type-homogeneous 95% or 99% of the time.

I would be rather curious to know how frequently a list consists of
"numbers", but a mix of ints and floats. From the point of view of a
list's purpose, they're all numbers (point 1 satisfied), and they're
all comparable (point 2 satisfied), but from the POV of your patch,
it's heterogeneous and suffers a performance penalty. Does it happen a
lot in real-world code?

(Apologies if this has already been mentioned; I've been skimming the
thread, not reading it in detail.)

ChrisA
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Steven D'Aprano
On Sun, Mar 05, 2017 at 07:19:43AM +, Elliot Gorokhovsky wrote:

> You may remember seeing some messages on here about optimizing list.sort()
> by exploiting type-homogeneity: since comparing apples and oranges is
> uncommon (though possible, i.e. float to int), it pays off to check if the
> list is type-homogeneous 

I sometimes need to know if a list is homogeneous, but unfortunately
checking large lists for a common type in pure Python is quite slow.

Here is a radical thought... why don't lists track their common type 
themselves? There's only a few methods which can add items:

- append
- extend
- insert
- __setitem__


Suppose we gave lists a read-only attribute, __type_hint__, which
returns None for heterogeneous lists and the type for homogeneous lists. Adding
an item to the list works as follows:

- if the list is empty, adding an item sets __type_hint__ to type(item);
- if the list is not empty, adding an item tests whether type(item) 
  is identical to (not a subclass of) __type_hint__, and if not, sets
  __type_hint__ to None;
- removing an item doesn't change the __type_hint__ unless the list 
  becomes empty, in which case it is reset to None;
- if the internal allocated space of the list shrinks, that triggers
  a recalculation of the __type_hint__ if it is currently None.

(There's no need to recalculate the hint if it is not None.)

Optional: doing a list.sort() could also recalculate the hint.


The effect will be:

- if __type_hint__ is a type object, then you can be sure that 
  the list is homogeneous;

- if the __type_hint__ is None, then it might still be 
  homogeneous, but it isn't safe to assume so.

Not only could sorting take advantage of the type hint without needing 
to do a separate O(N) scan of the list, but so could other code. I know 
I would be interested in using this. I have a fair amount of code that 
has to track the type of any items seen in a list, and swap to a "type 
agnostic but slow" version if the list is not homogeneous. I could 
probably replace that with some variation of:

if thelist.__type_hint__ is None:
    process_slow(thelist)
else:
    process_fast(thelist)


At the very least, I'd be interested in experimenting with this.

Thoughts?
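
For anyone who wants to experiment before touching C, a pure-Python prototype
of the rules above might look roughly like this (a sketch only: the class name
is a placeholder, slice assignment is handled crudely, and the
allocation-shrink recalculation has no pure-Python equivalent):

class HintedList(list):
    """Pure-Python sketch of the __type_hint__ tracking rules."""

    def __init__(self, iterable=()):
        super().__init__()
        self._hint = None
        self.extend(iterable)

    @property
    def __type_hint__(self):
        return self._hint

    def _note_added(self, item):
        if len(self) == 0:
            self._hint = type(item)           # first item sets the hint
        elif type(item) is not self._hint:    # exact type only, not subclasses
            self._hint = None

    def append(self, item):
        self._note_added(item)
        super().append(item)

    def insert(self, index, item):
        self._note_added(item)
        super().insert(index, item)

    def extend(self, iterable):
        for item in iterable:
            self.append(item)

    def __setitem__(self, index, item):
        if isinstance(index, slice):
            self._hint = None                 # crude: any slice assignment clears the hint
        else:
            self._note_added(item)
        super().__setitem__(index, item)

    def pop(self, index=-1):
        value = super().pop(index)
        if not self:
            self._hint = None                 # removal only matters once the list is empty
        return value

lst = HintedList([1, 2, 3])
print(lst.__type_hint__)    # <class 'int'>
lst.append(0.5)
print(lst.__type_hint__)    # None
lst.pop()                   # all ints again...
print(lst.__type_hint__)    # ...but still None: a "false None", homogeneous yet unhinted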



-- 
Steve
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Elliot Gorokhovsky
On Sun, Mar 5, 2017 at 6:30 PM INADA Naoki  wrote:

> Good job.  I'll read your work later.
>

Thanks!

> So please don't use string-key dict specialization to rationalize
> list-sort specialization.
>

I'm not quite doing that: I agree that dict is a special case because of
internal use. I'm just saying that I'm not the first person to try to
exploit type homogeneity in data structures in CPython. However, my
justification is independent of dict specialization. Namely, it's the
following two considerations:

(1) lists are idiomatically type-homogeneous. To quote the Python
documentation, "Lists are mutable, and *their elements are usually
homogeneous* and are accessed by iterating over the list".
(2) it's not very common to have to "compare apples and oranges". While
it's technically possible to define comparison between any two types you
like, and in practice one sometimes compares e.g. ints and floats, in
practice it's pretty safe to assume the lists you're sorting are going to
be type-homogeneous 95% or 99% of the time.

> I feel a 2-type (int, string) or 3-type (int, string, tuple) special case
> may be worthwhile if the code is clean and the benefit seems good.
>

I agree -- I only optimized for int, string, float, and tuple. However, my
code optimizes sorting for *any* type-homogeneous list as well, by simply
replacing PyObject_RichCompareBool with ob_type->tp_richcompare, saving the
method lookup time. This is what gives the speedups for non-Latin strings
and bigints in the benchmarks I shared in my original post: I don't have a
special-case compare function for them, but by setting the compare function
pointer equal to ob_type->tp_richcompare, I can at least save a little bit of
method lookup time.
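
In rough Python terms (a sketch of the decision only, with made-up names; the
real patch does this in C, and, per the bigint/non-Latin-string remark above,
applies its special cases only when finer conditions hold), the pre-sort check
amounts to picking one comparison strategy for the whole sort:

from fractions import Fraction

def pick_strategy(keys):
    """Decide, once per sort, how every comparison will be done."""
    if not keys:
        return "nothing to sort"
    first_type = type(keys[0])
    if any(type(k) is not first_type for k in keys):
        return "heterogeneous: fall back to PyObject_RichCompareBool"
    if first_type in (int, float, str, tuple):
        return "homogeneous %s: use a special-case compare" % first_type.__name__
    # Homogeneous, but no special case: still skip the per-comparison method
    # lookup by resolving the type's rich comparison (tp_richcompare) once.
    return "homogeneous %s: call its tp_richcompare directly" % first_type.__name__

print(pick_strategy([3, 1, 2]))                          # special-case int compare
print(pick_strategy([Fraction(1, 2), Fraction(1, 3)]))   # generic, but resolved once
print(pick_strategy([3, 1, 2.0]))                        # heterogeneous fallback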

In terms of clean code, the special-case compare functions include
assertions in #ifdef Py_DEBUG blocks that list all the requirements
necessary to make them safe. The pre-sort check then verifies the
assumptions for whichever optimized compare is going to be used. So all
that's necessary to verify correctness is to convince oneself that (1) the
assertions are sufficient and (2) the pre-sort check will never use a
compare function for which it cannot prove the assertions. These two points
are pretty easy to check:
(1) Just plug in the assertions in the original code, e.g. unicode_compare
in Objects/unicodeobject.c. Then you have a bunch of if(1) and if(0)
blocks, and if you delete the if(0) blocks you'll get exactly what I have
in the patch.
(2) The pre-sort check code is very simple, and well commented.
I would add further that this patch only actually adds one line of code to
listsort: assign_compare_function(lo.keys, saved_ob_size). The rest of the
patch is just function definitions, and replacing PyObject_RichCompareBool
with a function pointer (in the macro for ISLT(X,Y)).

Anyway, thanks in advance for your time!
Elliot
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread INADA Naoki
Good job.  I'll read your work later.

> relatively simple optimization. I would also add that Python dictionaries
> already implement this optimization: they start out optimizing based on the
> assumption that they'll only be seeing string keys, checking to make sure
> that assumption holds as they go. If they see a non-string key, they
> permanently switch over to the general implementation.

Please notice that dict is very, very special.
Since Python is a dynamic language, name resolution is done at runtime,
not at compile time.
And most name resolutions involve looking up a string key in a dict.

So please don't use string-key dict specialization to rationalize
list-sort specialization.
Each specialization should be considered carefully, weighing the
maintenance cost against the benefit.

I feel a 2-type (int, string) or 3-type (int, string, tuple) special case
may be worthwhile if the code is clean and the benefit seems good.
But more complex cases can be done in third-party libraries, like numpy.

In a third-party library, you can use more powerful optimizations like
NaN-boxing. In the built-in Python list this is hard, because we guarantee
that object identity does not change.

Regards,
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread Elliot Gorokhovsky
On Sun, Mar 5, 2017 at 10:51 AM David Mertz  wrote:

> Thanks for the hard work, this looks very promising.
>

Thank you!


> Real world data tends to be mostly-sorted.  So it would be useful for your
> benchmarks to include:
>
> A) Performance on completely sorted data
>   i) Of homogenous type
>   ii) Of mixed type
> B) Performance on mostly-sorted data
>   i) Of homogenous type
>   ii) Of mixed type
>

You are entirely correct, as the benchmarks below demonstrate. I used the
benchmark lists from Objects/listsort.txt, which are:

  \sort: descending data
  /sort: ascending data
  3sort: ascending, then 3 random exchanges
  +sort: ascending, then 10 random at the end
  %sort: ascending, then randomly replace 1% of elements w/ random values
  ~sort: many duplicates
  =sort: all equal
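
For concreteness, inputs of these shapes can be generated roughly as follows
(a sketch based on the descriptions above, not the actual benchmark script,
which is linked below):

import random

def make_case(kind, n=2**15):
    base = list(range(n))                    # ascending ints
    if kind == "\\sort":                     # descending data
        return base[::-1]
    if kind == "/sort":                      # ascending data
        return base
    if kind == "3sort":                      # ascending, then 3 random exchanges
        for _ in range(3):
            i, j = random.randrange(n), random.randrange(n)
            base[i], base[j] = base[j], base[i]
        return base
    if kind == "+sort":                      # ascending, then 10 random at the end
        return base[:-10] + [random.randrange(n) for _ in range(10)]
    if kind == "%sort":                      # randomly replace about 1% of elements
        for _ in range(n // 100):
            base[random.randrange(n)] = random.randrange(n)
        return base
    if kind == "~sort":                      # many duplicates
        return [random.randrange(4) for _ in range(n)]
    if kind == "=sort":                      # all equal
        return [5] * n
    raise ValueError(kind)

cases = {k: make_case(k) for k in
         ("\\sort", "/sort", "3sort", "+sort", "%sort", "~sort", "=sort")}
# For the heterogeneous runs below, append a single 0.0 to each list.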

My results are below (the script can be found at
https://github.com/embg/python-fastsort-benchmark/blob/master/bench-structured.py
):
Homogeneous ([int]):
\sort: 54.6%
/sort: 56.5%
3sort: 53.5%
+sort: 55.3%
%sort: 52.4%
~sort: 48.0%
=sort: 45.2%

Heterogeneous ([int]*n + [0.0]):
\sort: -17.2%
/sort: -19.8%
3sort: -18.0%
+sort: -18.8%
%sort: -10.0%
~sort: -2.1%
=sort: -21.3%

As you can see, because there's a lot less non-comparison overhead in the
structured lists, the impact of the optimization is much greater, both in
performance gain and in worst-case cost. However, I would argue that these
data do not invalidate the utility of my patch: the probability of
encountering a type-heterogeneous list is certainly less than 5% in
practice. So the expected savings, even for structured lists, is something
like (5%)(-20%) + (95%)(50%) = 46.5%. And, of course, not *all* the lists
one encounters in practice are structured; certainly not *this* structured.
So, overall, I would say the numbers above are extremely encouraging.
Thanks for pointing out the need for this benchmark, though!

Elliot
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-05 Thread David Mertz
On Sat, Mar 4, 2017 at 11:19 PM, Elliot Gorokhovsky <
elliot.gorokhov...@gmail.com> wrote:

> (Summary of results: my patch at https://bugs.python.org/issue28685 makes
> list.sort() 30-50% faster in common cases, and at most 1.5% slower in the
> uncommon worst case.)
>

Thanks for the hard work, this looks very promising.


> While listsort.txt talks about benchmarking different kinds of structured
> lists, instead of just random lists, the results here would hold in those
> cases just as well, because this makes individual comparisons cheaper,
> instead of reducing the number of comparisons based on structure).
>

This point seems wrong on its face.  The main point of Timsort is to avoid
comparisons in mostly-sorted lists.  It makes a lot of difference whether
the initial data is random or mostly-sorted.

I presume the type homogeneity check takes a fixed O(N) time to perform.
In the random case, the sort will take O(N * log N) comparisons.  As a
multiplier, naive comparisons are clearly more expensive than a type check.
But whether that type-checking overhead is worthwhile obviously depends on
the number of comparisons, which might be O(N) if the data is sorted.
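
A back-of-envelope illustration (the numbers here are assumed, purely to show
the shape of the trade-off):

import math

n = 10**6
random_comparisons = n * math.log2(n)    # roughly what a sort of random data needs
sorted_comparisons = n - 1               # Timsort on already-sorted data: one run
type_check_cost = n                      # the O(N) pre-sort homogeneity scan

print(type_check_cost / random_comparisons)  # ~0.05: the scan is a few percent extra
print(type_check_cost / sorted_comparisons)  # ~1.0: the scan doubles the element visits,
                                             # though each visit is cheaper than a compare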

Real world data tends to be mostly-sorted.  So it would be useful for your
benchmarks to include:

A) Performance on completely sorted data
  i) Of homogenous type
  ii) Of mixed type
B) Performance on mostly-sorted data
  i) Of homogenous type
  ii) Of mixed type

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/