Re: [Python-ideas] Optimizing list.sort() by checking type in advance

2016-10-14 Thread Franklin? Lee
On Mon, Oct 10, 2016 at 11:29 PM, Elliot Gorokhovsky
 wrote:
>> Note that when Python's current sort was adopted in Java, they still kept
>> a quicksort variant for "unboxed" builtin types.  The adaptive merge sort
>> incurs many overheads that often cost more than they save unless comparisons
>> are in fact very expensive compared to the cost of pointer copying (and in
>> Java comparison of unboxed types is cheap).  Indeed, for native numeric
>> types, where comparison is dirt cheap, quicksort generally runs faster than
>> mergesort despite that the former does _more_ comparisons (because mergesort
>> does so much more pointer-copying).
>
>
> Ya, I think this may be a good approach for floats: if the list is all
> floats, just copy all the floats into a seperate array, use the standard
> library quicksort, and then construct a sorted PyObject* array. Like maybe
> set up a struct { PyObject* payload, float key } type of deal. This wouldn't
> work for strings (unicode is scary), and probably not for ints (one would
> have to check that all the ints are within C long bounds). Though on the
> other hand perhaps this would be too expensive?

I happened onto a page talking about float radix sort, and thought of
this thread.

Here it is: http://stereopsis.com/radix.html
The author claimed an 8x speedup, though the test was done nearly
fifteen years ago.

I was unsure about posting publicly, because it's not as if an even
faster float sort would help decide whether specialized sorts are
worth adding to CPython. I'm posting for history.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Greg Ewing

Elliot Gorokhovsky wrote:
if the list is all 
floats, just copy all the floats into a seperate array, use the standard 
library quicksort, and then construct a sorted PyObject* array.


My question would be whether sorting list of just floats
(or where the keys are just floats) is common enough to
be worth doing this.

--
Greg
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Jonathan Goble
On Mon, Oct 10, 2016 at 11:30 PM Elliot Gorokhovsky <
elliot.gorokhov...@gmail.com> wrote:

> - I expect tuples will also be worth specializing (complex sort keys are
> often implemented as tuples).
>
> I'm not sure what you mean here... I'm looking at the types of lo.keys,
> not of saved_ob_item (I think I said that earlier in this thread by mistake
> actually). So if someone is passing tuples and using itemgetter to extract
> ints or strings or whatever, the current code will work fine; lo.keys will
> be scalar types. Unless I misunderstand you here. I mean, when would
> lo.keys actually be tuples?
>

If someone wanted to sort, e.g., a table (likely a list of tuples) by
multiple columns at once, they might pass the key function as
`itemgetter(3, 4, 5)`, meaning to sort by "column" (actually item) 3, then
columns 4 and then 5 as tiebreakers. This itemgetter will return a new
tuple of three items, that tuple being the key to sort by. Since tuples
sort by the first different item, in this theoretical example the result of
sort() will be exactly what the user wanted: a table sorted by three
columns at once.

A practical example of such a use case is sorting by last name first and
then by first name where two people have the same last name. Assuming a
list of dicts in this case, the key function passed to sort() would simply
be `itemgetter('lastname", "firstname")`, which returns a tuple of two
items to use as the key.

So yes, there are perfectly valid use cases for tuples as keys.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Elliot Gorokhovsky
It would still be stable. You would copy over {{x,1.0},{y,1.0},{x,1.0}},
and as long as a stable sort is used you would get out the same array,
using the cmp function left->key < right->key.  Then you would go in order,
copying back [x,y,x].

On Mon, Oct 10, 2016 at 9:49 PM Chris Angelico  wrote:

> On Tue, Oct 11, 2016 at 2:41 PM, Elliot Gorokhovsky
>  wrote:
> > Oh no, the idea here is just you would copy over the floats associated
> with
> > the PyObject* and keep them in an array of such structs, so that we know
> > which PyObject* are associated with which floats. Then after the standard
> > library quicksort sorts them you would copy the PyObject* into the list.
> So
> > you sort the PyObject* keyed by the floats. Anyway, I think the copying
> back
> > and forth would probably be too expensive, it's just an idea.
>
> It also wouldn't work if you have more than one object with the same value.
>
> >>> x = 1.0
> >>> y = 2.0/2
> >>> x is y
> False
> >>> l = [x, y, x]
> >>> l.sort()
> >>> assert l[0] is x
> >>> assert l[1] is y
> >>> assert l[2] is x
>
> Python's sort is stable, so the three elements of the list (being all
> equal) must remain in the same order.
>
> ChrisA
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Chris Angelico
On Tue, Oct 11, 2016 at 2:41 PM, Elliot Gorokhovsky
 wrote:
> Oh no, the idea here is just you would copy over the floats associated with
> the PyObject* and keep them in an array of such structs, so that we know
> which PyObject* are associated with which floats. Then after the standard
> library quicksort sorts them you would copy the PyObject* into the list. So
> you sort the PyObject* keyed by the floats. Anyway, I think the copying back
> and forth would probably be too expensive, it's just an idea.

It also wouldn't work if you have more than one object with the same value.

>>> x = 1.0
>>> y = 2.0/2
>>> x is y
False
>>> l = [x, y, x]
>>> l.sort()
>>> assert l[0] is x
>>> assert l[1] is y
>>> assert l[2] is x

Python's sort is stable, so the three elements of the list (being all
equal) must remain in the same order.

ChrisA
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Elliot Gorokhovsky
Oh no, the idea here is just you would copy over the floats associated with
the PyObject* and keep them in an array of such structs, so that we know
which PyObject* are associated with which floats. Then after the standard
library quicksort sorts them you would copy the PyObject* into the list. So
you sort the PyObject* keyed by the floats. Anyway, I think the copying
back and forth would probably be too expensive, it's just an idea. Also, I
apologize for the formatting of my last email, I didn't realize Inbox would
mess up the quoting like that. I'll ensure I use plain-text quotes from now
on.

On Mon, Oct 10, 2016 at 9:38 PM Chris Angelico  wrote:

> On Tue, Oct 11, 2016 at 2:29 PM, Elliot Gorokhovsky
>  wrote:
> > Ya, I think this may be a good approach for floats: if the list is all
> > floats, just copy all the floats into a seperate array, use the standard
> > library quicksort, and then construct a sorted PyObject* array. Like
> maybe
> > set up a struct { PyObject* payload, float key } type of deal.
>
> Not quite sure what you mean here. What is payload, what is key? Are
> you implying that the original float objects could be destroyed and
> replaced with others of equal value? Python (unlike insurance claims)
> guarantees that you get back the exact same object as you started
> with.
>
> ChrisA
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Chris Angelico
On Tue, Oct 11, 2016 at 2:29 PM, Elliot Gorokhovsky
 wrote:
> Ya, I think this may be a good approach for floats: if the list is all
> floats, just copy all the floats into a seperate array, use the standard
> library quicksort, and then construct a sorted PyObject* array. Like maybe
> set up a struct { PyObject* payload, float key } type of deal.

Not quite sure what you mean here. What is payload, what is key? Are
you implying that the original float objects could be destroyed and
replaced with others of equal value? Python (unlike insurance claims)
guarantees that you get back the exact same object as you started
with.

ChrisA
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/