Re: [Cython] optimizing list.pop

2009-11-03 Thread Robert Bradshaw
On Nov 3, 2009, at 2:37 PM, Lisandro Dalcin wrote:

> On Tue, Nov 3, 2009 at 4:20 PM, Robert Bradshaw
>  wrote:
>> I concede that we're partially copying the (half-line) list-shrink  
>> trigger
>> algorithm here, but if that changed nothing would break. It's not  
>> like we're
>> using any secret APIs.
>>
>
> So L->allocated is public API? If you can get rid of that, I think no
> one would object your patch...

Well, it's clearly documented in listobject.h, and we didn't have to  
import any special headers to get it (like we do for frame objects and  
exception handling). There's nothing clean and simple to use instead,  
at least not without giving up nearly all the potential savings.

>> For the no argument case, there really is hardly any code  
>> duplication. Also,
>> we don't do any reallocation or anything, we just note that if the  
>> number of
>> items get small enough we dispatch to the generic case. I think the
>> resulting 5x speedup is worth it.
>>
>>> Why not use PyList_GET_ITEM(L) + Py_INCREF() + PyList_SetSlice() ??
>>> This way, you could easily support "L.pop(index)"...
>>
>> Have you seen the code for PyList_SetSlice? It's way more generic  
>> (and
>> complicated) than is needed in this case. And if one were to use  
>> the methods
>> mentioned above, one would still need all the bounds checking code  
>> that's
>> there (the only savings would be calling PyList_SetSlice rather  
>> than having
>> a manual loop, at the expense of extra memory management). Again,  
>> I'd argue
>> it's worth an 10x speedup.
>>
>
> Robert, "premature optimization is the root of all evil"... You are
> trying to be faster than CPython!! list.pop() is implemented more or
> less with PyList_SetSlice !!

You say this as if it's a bad thing to be faster than CPython :)

>> Ideally there should be a PyList_Pop method in the C API, and more  
>> efficient
>> than the one that's there, but that's not the case.
>
> Yes, of course... And if it were, it would likely be implemented with
> PyList_SetSlice, as the list.pop() method is currently is... Or core
> CPython devs are truly lazy, or you Robert are wasting valuable time
> in micro-optimizations ...

Well, at least I spent less time on that micro-optimization than  
answering these emails :). (Not that I think that either is a waste.)  
My hunch is that since there was no (natural) way to pop without a  
Python function call, and the attribute lookup and argument unpacking  
took more time each than the PyList_SetSlice code overhead, there was  
no incentive for the core CPython devs to put any time into optimizing  
this, if it even crossed their radar.

- Robert

___
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev


Re: [Cython] optimizing list.pop

2009-11-03 Thread Lisandro Dalcin
On Tue, Nov 3, 2009 at 4:20 PM, Robert Bradshaw
 wrote:
> On Nov 3, 2009, at 7:08 AM, Lisandro Dalcin wrote:
>
>> On Tue, Nov 3, 2009 at 12:28 PM, Dag Sverre Seljebotn
>>  wrote:
>>>
>>> Stefan Behnel wrote:

 Robert Bradshaw, 03.11.2009 10:22:

> On Oct 25, 2009, at 11:53 AM, Lisandro Dalcin wrote:
>
>> I think Cython could generate a custom, inline function
>> implementing l.pop([n]) and fast dispatch to it if the object is
>> exactly a list ...
>>
> http://hg.cython.org/cython-devel/rev/2e3dda4a7d23
>

 Hmm, is this really worth the code duplication with CPython? I mean, we
 are
 partly copying the list-shrink trigger algorithm here, which I would
 prefer
 to consider an implementation detail of the list type.

>>> I second this, this is the kind of code I don't like to see Cython
>>> output. At the very least, one should #IFDEF it to only kick in with
>>> known good Python versions...it's not the kind of thing we'll
>>> automatically remember to check when Python 3.4 comes out.
>
> I concede that we're partially copying the (half-line) list-shrink trigger
> algorithm here, but if that changed nothing would break. It's not like we're
> using any secret APIs.
>

So L->allocated is public API? If you can get rid of that, I think no
one would object your patch...

>>> It should be possible to find a middle route somewhere though? (At the
>>> very least prefetch "pop" of the list type rather than use getattr every
>>> time.)
>>>
>>> Dag Sverre
>>> ___
>>> Cython-dev mailing list
>>> Cython-dev@codespeak.net
>>> http://codespeak.net/mailman/listinfo/cython-dev
>>>
>>
>> I agree with Stefan and Dag...
>
> For the no argument case, there really is hardly any code duplication. Also,
> we don't do any reallocation or anything, we just note that if the number of
> items get small enough we dispatch to the generic case. I think the
> resulting 5x speedup is worth it.
>
>> Why not use PyList_GET_ITEM(L) + Py_INCREF() + PyList_SetSlice() ??
>> This way, you could easily support "L.pop(index)"...
>
> Have you seen the code for PyList_SetSlice? It's way more generic (and
> complicated) than is needed in this case. And if one were to use the methods
> mentioned above, one would still need all the bounds checking code that's
> there (the only savings would be calling PyList_SetSlice rather than having
> a manual loop, at the expense of extra memory management). Again, I'd argue
> it's worth an 10x speedup.
>

Robert, "premature optimization is the root of all evil"... You are
trying to be faster than CPython!! list.pop() is implemented more or
less with PyList_SetSlice !!


> Ideally there should be a PyList_Pop method in the C API, and more efficient
> than the one that's there, but that's not the case.
>

Yes, of course... And if it were, it would likely be implemented with
PyList_SetSlice, as the list.pop() method is currently is... Or core
CPython devs are truly lazy, or you Robert are wasting valuable time
in micro-optimizations ...



> ---
>
> In [1]: from benchmark_pop import *
>
> In [2]: L = range(10**7)
>
> In [3]: time cython_pop(L, 10**7)
> CPU times: user 0.15 s, sys: 0.00 s, total: 0.16 s
> Wall time: 0.16 s
>
> In [5]: L = range(10**7)
>
> In [6]: time python_pop(L, 10**7)
> CPU times: user 0.78 s, sys: 0.00 s, total: 0.78 s
> Wall time: 0.78 s
>
> In [8]: L = range(10**7)
>
> In [9]: time cython_pop_index(L, 10**7-1, -2)
> CPU times: user 0.21 s, sys: 0.00 s, total: 0.21 s
> Wall time: 0.21 s
>
> In [11]: L = range(10**7)
>
> In [12]: time python_pop_index(L, 10**7-1, -2)
> CPU times: user 2.38 s, sys: 0.01 s, total: 2.39 s
> Wall time: 2.40 s
>
>
> ___
> Cython-dev mailing list
> Cython-dev@codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>
>



-- 
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
___
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev


Re: [Cython] optimizing list.pop

2009-11-03 Thread Robert Bradshaw

On Nov 3, 2009, at 7:08 AM, Lisandro Dalcin wrote:


On Tue, Nov 3, 2009 at 12:28 PM, Dag Sverre Seljebotn
 wrote:

Stefan Behnel wrote:

Robert Bradshaw, 03.11.2009 10:22:


On Oct 25, 2009, at 11:53 AM, Lisandro Dalcin wrote:


I think Cython could generate a custom, inline function
implementing l.pop([n]) and fast dispatch to it if the object is
exactly a list ...


http://hg.cython.org/cython-devel/rev/2e3dda4a7d23



Hmm, is this really worth the code duplication with CPython? I  
mean, we are
partly copying the list-shrink trigger algorithm here, which I  
would prefer

to consider an implementation detail of the list type.


I second this, this is the kind of code I don't like to see Cython
output. At the very least, one should #IFDEF it to only kick in with
known good Python versions...it's not the kind of thing we'll
automatically remember to check when Python 3.4 comes out.


I concede that we're partially copying the (half-line) list-shrink  
trigger algorithm here, but if that changed nothing would break. It's  
not like we're using any secret APIs.


It should be possible to find a middle route somewhere though? (At  
the
very least prefetch "pop" of the list type rather than use getattr  
every

time.)

Dag Sverre
___
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev



I agree with Stefan and Dag...


For the no argument case, there really is hardly any code duplication.  
Also, we don't do any reallocation or anything, we just note that if  
the number of items get small enough we dispatch to the generic case.  
I think the resulting 5x speedup is worth it.



Why not use PyList_GET_ITEM(L) + Py_INCREF() + PyList_SetSlice() ??
This way, you could easily support "L.pop(index)"...


Have you seen the code for PyList_SetSlice? It's way more generic (and  
complicated) than is needed in this case. And if one were to use the  
methods mentioned above, one would still need all the bounds checking  
code that's there (the only savings would be calling PyList_SetSlice  
rather than having a manual loop, at the expense of extra memory  
management). Again, I'd argue it's worth an 10x speedup.


Ideally there should be a PyList_Pop method in the C API, and more  
efficient than the one that's there, but that's not the case.


- Robert


---

In [1]: from benchmark_pop import *

In [2]: L = range(10**7)

In [3]: time cython_pop(L, 10**7)
CPU times: user 0.15 s, sys: 0.00 s, total: 0.16 s
Wall time: 0.16 s

In [5]: L = range(10**7)

In [6]: time python_pop(L, 10**7)
CPU times: user 0.78 s, sys: 0.00 s, total: 0.78 s
Wall time: 0.78 s

In [8]: L = range(10**7)

In [9]: time cython_pop_index(L, 10**7-1, -2)
CPU times: user 0.21 s, sys: 0.00 s, total: 0.21 s
Wall time: 0.21 s

In [11]: L = range(10**7)

In [12]: time python_pop_index(L, 10**7-1, -2)
CPU times: user 2.38 s, sys: 0.01 s, total: 2.39 s
Wall time: 2.40 s



benchmark_pop.pyx
Description: Binary data
___
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev


Re: [Cython] optimizing list.pop

2009-11-03 Thread Lisandro Dalcin
On Tue, Nov 3, 2009 at 12:28 PM, Dag Sverre Seljebotn
 wrote:
> Stefan Behnel wrote:
>> Robert Bradshaw, 03.11.2009 10:22:
>>
>>> On Oct 25, 2009, at 11:53 AM, Lisandro Dalcin wrote:
>>>
 I think Cython could generate a custom, inline function
 implementing l.pop([n]) and fast dispatch to it if the object is
 exactly a list ...

>>> http://hg.cython.org/cython-devel/rev/2e3dda4a7d23
>>>
>>
>> Hmm, is this really worth the code duplication with CPython? I mean, we are
>> partly copying the list-shrink trigger algorithm here, which I would prefer
>> to consider an implementation detail of the list type.
>>
> I second this, this is the kind of code I don't like to see Cython
> output. At the very least, one should #IFDEF it to only kick in with
> known good Python versions...it's not the kind of thing we'll
> automatically remember to check when Python 3.4 comes out.
>
> It should be possible to find a middle route somewhere though? (At the
> very least prefetch "pop" of the list type rather than use getattr every
> time.)
>
> Dag Sverre
> ___
> Cython-dev mailing list
> Cython-dev@codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>

I agree with Stefan and Dag...

Why not use PyList_GET_ITEM(L) + Py_INCREF() + PyList_SetSlice() ??
This way, you could easily support "L.pop(index)"...


-- 
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
___
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev


Re: [Cython] optimizing list.pop

2009-11-03 Thread Dag Sverre Seljebotn
Stefan Behnel wrote:
> Robert Bradshaw, 03.11.2009 10:22:
>   
>> On Oct 25, 2009, at 11:53 AM, Lisandro Dalcin wrote:
>> 
>>> I think Cython could generate a custom, inline function
>>> implementing l.pop([n]) and fast dispatch to it if the object is
>>> exactly a list ...
>>>   
>> http://hg.cython.org/cython-devel/rev/2e3dda4a7d23
>> 
>
> Hmm, is this really worth the code duplication with CPython? I mean, we are
> partly copying the list-shrink trigger algorithm here, which I would prefer
> to consider an implementation detail of the list type.
>   
I second this, this is the kind of code I don't like to see Cython 
output. At the very least, one should #IFDEF it to only kick in with 
known good Python versions...it's not the kind of thing we'll 
automatically remember to check when Python 3.4 comes out.

It should be possible to find a middle route somewhere though? (At the 
very least prefetch "pop" of the list type rather than use getattr every 
time.)

Dag Sverre
___
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev


Re: [Cython] optimizing list.pop

2009-11-03 Thread Stefan Behnel

Robert Bradshaw, 03.11.2009 10:22:
> On Oct 25, 2009, at 11:53 AM, Lisandro Dalcin wrote:
>> I think Cython could generate a custom, inline function
>> implementing l.pop([n]) and fast dispatch to it if the object is
>> exactly a list ...
> 
> http://hg.cython.org/cython-devel/rev/2e3dda4a7d23

Hmm, is this really worth the code duplication with CPython? I mean, we are
partly copying the list-shrink trigger algorithm here, which I would prefer
to consider an implementation detail of the list type.

Stefan
___
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev


Re: [Cython] optimizing list.pop

2009-11-03 Thread Robert Bradshaw
On Oct 25, 2009, at 11:53 AM, Lisandro Dalcin wrote:

> On Sun, Oct 25, 2009 at 2:39 PM, Sturla Molden   
> wrote:
>> I just realized the Python does not provide a fast way of popping an
>> item from a list, just append. I.e. there is a PyList_Append, but no
>> PyList_Pop. And in Python's code for _listobject.c, the function  
>> listpop
>> is declared static. When using a list as a stack (a common case for
>> Python lists), a fast pop is just as important as a fast append.
>>
>
> Indeed. +1 for the idea, but...
>
>> Since the function listpop is static, the fastest way to pop a list  
>> is
>> to grab the method named "pop" from the PyTypeObject. This might  
>> not be
>> obvious to everyone, so I think it is an optimization that Cython  
>> should do.
>>
>
> -0.5 for the implementation... The hack is cute, but still have to
> make a listpop(self,args), were you need to create a tuple to make the
> call... I think Cython could generate a custom, inline function
> implementing l.pop([n]) and fast dispatch to it if the object is
> exactly a list ...

http://hg.cython.org/cython-devel/rev/2e3dda4a7d23

- Robert

___
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev


Re: [Cython] optimizing list.pop

2009-10-25 Thread Lisandro Dalcin
On Sun, Oct 25, 2009 at 2:39 PM, Sturla Molden  wrote:
> I just realized the Python does not provide a fast way of popping an
> item from a list, just append. I.e. there is a PyList_Append, but no
> PyList_Pop. And in Python's code for _listobject.c, the function listpop
> is declared static. When using a list as a stack (a common case for
> Python lists), a fast pop is just as important as a fast append.
>

Indeed. +1 for the idea, but...

> Since the function listpop is static, the fastest way to pop a list is
> to grab the method named "pop" from the PyTypeObject. This might not be
> obvious to everyone, so I think it is an optimization that Cython should do.
>

-0.5 for the implementation... The hack is cute, but still have to
make a listpop(self,args), were you need to create a tuple to make the
call... I think Cython could generate a custom, inline function
implementing l.pop([n]) and fast dispatch to it if the object is
exactly a list ...

>
>
> cdef extern from "Python.h":
>
>    ctypedef object (*PyCFunction)(object self, object args)
>
>    ctypedef struct PyMethodDef:
>        char *ml_name
>        PyCFunction ml_meth
>
>    ctypedef struct PyTypeObject:
>        PyMethodDef *tp_methods
>
>    PyTypeObject PyList_Type
>
>
> cdef PyCFunction __listpop = NULL
>
>
> cpdef listpop(list self, Py_ssize_t n=-1):
>
>    global __listpop
>
>    cdef PyMethodDef *ptr
>    cdef int ok = 0
>
>    if not __listpop:
>        ptr = PyList_Type.tp_methods;
>        while ptr:
>            if str(ptr[0].ml_name) == "pop":
>               __listpop = ptr.ml_meth
>               ok = 1
>               break
>            ptr += 1
>        if not ok:
>            raise ValueError, \
>              'Could not find method "pop" in list type object'
>    return __listpop(self, (int(n),))
>
>
>
> ___
> Cython-dev mailing list
> Cython-dev@codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>



-- 
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
___
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev


[Cython] optimizing list.pop

2009-10-25 Thread Sturla Molden
I just realized the Python does not provide a fast way of popping an 
item from a list, just append. I.e. there is a PyList_Append, but no 
PyList_Pop. And in Python's code for _listobject.c, the function listpop 
is declared static. When using a list as a stack (a common case for 
Python lists), a fast pop is just as important as a fast append.

Since the function listpop is static, the fastest way to pop a list is 
to grab the method named "pop" from the PyTypeObject. This might not be 
obvious to everyone, so I think it is an optimization that Cython should do.

Sturla Molden


cdef extern from "Python.h":

ctypedef object (*PyCFunction)(object self, object args)

ctypedef struct PyMethodDef:
char *ml_name
PyCFunction ml_meth

ctypedef struct PyTypeObject:
PyMethodDef *tp_methods

PyTypeObject PyList_Type


cdef PyCFunction __listpop = NULL


cpdef listpop(list self, Py_ssize_t n=-1):

global __listpop

cdef PyMethodDef *ptr
cdef int ok = 0

if not __listpop:
ptr = PyList_Type.tp_methods;
while ptr:
if str(ptr[0].ml_name) == "pop":
   __listpop = ptr.ml_meth
   ok = 1
   break
ptr += 1
if not ok:
raise ValueError, \
  'Could not find method "pop" in list type object'
return __listpop(self, (int(n),))



___
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev