On Apr 15, 2009, at 7:31 PM, Brent Pedersen wrote:

> On Wed, Apr 15, 2009 at 11:25 AM, Stefan Behnel  
> <[email protected]> wrote:
>> Hi,
>>
>> thanks for sharing that.
>>
>> Brent Pedersen wrote:
>>> assuming i haven't done anything stupid
>>
>> No, nothing stupid, but something that can reduce the comparability
>> of the timings. You are creating a 1000000 item list on each
>> benchmark, using a call to range() in some cases and a list
>> comprehension in others.
>>
>> It's usually better to move initialisations out of the timings, e.g.
>> by creating a large range() object once and re-using it. That reduces
>> the impact of unrelated operations on the absolute numbers.
>>
>
> ah, i see, updated that. fixing that makes the python constructor look
> even slower.
> now it assumes that creating a list comprehension without assigning to
> a variable is the same as calling a function that returns an array,
> also without assigning.
>
> here are the new timings:
>
> PY_NEW on Cython class: 1.137
> __init__ on Python class: 28.468
> __init__ on Python class with slots: 9.936
> batch PY_NEW total: 0.821 , interval only: 0.363
> batch __init__ on Cython class total 0.975 , interval_only: 0.524
> __init__ on Cython class 1.154
>
> so for this case using the PY_NEW macro actually doesn't improve speed
> that much over a cdef'ed class,
> especially if using a "batch" method is applicable (as it is for my
> use-case).
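
On the advice quoted above about moving initialisation out of the timings, here is a minimal pure-Python sketch of the idea. The Interval class, N, and build_all are hypothetical stand-ins and not Brent's actual benchmark code. The input list is built once, so only object construction falls inside the timed region.

```python
import timeit


# Hypothetical stand-in for the class being benchmarked.
class Interval(object):
    __slots__ = ("start", "stop")

    def __init__(self, start, stop):
        self.start = start
        self.stop = stop


N = 100000
starts = list(range(N))  # built once, outside the timed region


def build_all():
    # Only the construction of the objects is measured here.
    return [Interval(s, s + 10) for s in starts]


# Take the best of a few repeats to reduce noise from other processes.
elapsed = min(timeit.repeat(build_all, number=1, repeat=3))
print("constructing %d objects: %.3f s" % (N, elapsed))
```

The absolute number depends on the machine; the point is only that the cost of creating the input is no longer mixed into it.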

Here are some more data points, specific to Sage (Python 2.5, Cython 0.11):

-------------------

%cython

from sage.rings.integer cimport Integer

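# Plain cdef class with no custom constructor, used as a baseline.
cdef class A:
     pass

def time_py_new(long N):
     # Create N Integers with the PY_NEW macro.
     cdef long i
     for i from 0 <= i < N:
         z = PY_NEW(Integer)

def time_py_init(long N):
     # Create N Integers with a regular constructor call.
     cdef long i
     for i from 0 <= i < N:
         z = Integer()

def time_py_new_A(long N):
     # Same as time_py_new, but for the plain cdef class A.
     cdef long i
     for i from 0 <= i < N:
         z = PY_NEW(A)

def time_py_init_A(long N):
     # Same as time_py_init, but for the plain cdef class A.
     cdef long i
     for i from 0 <= i < N:
         z = A()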

--------------------

sage: time time_py_init(10**7)
Time: CPU 0.67 s, Wall: 0.68 s
sage: time time_py_new(10**7)
Time: CPU 0.16 s, Wall: 0.17 s
sage: time time_py_init_A(10**7)
Time: CPU 0.66 s, Wall: 0.67 s
sage: time time_py_new_A(10**7)
Time: CPU 0.47 s, Wall: 0.48 s
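
The relative savings implied by these CPU timings can be worked out with a few lines of plain Python. This is only arithmetic on the numbers reported above; the saving() helper is a hypothetical name introduced for illustration.

```python
# CPU times in seconds for 10**7 iterations, copied from the output above.
N = 10**7
integer_init, integer_new = 0.67, 0.16
a_init, a_new = 0.66, 0.47


def saving(init_time, new_time):
    # Fraction of the constructor-call time saved by using PY_NEW.
    return 1.0 - new_time / init_time


integer_saving = saving(integer_init, integer_new)  # about 0.76
a_saving = saving(a_init, a_new)                    # about 0.29

print("Integer: %.0f%% saved" % (100 * integer_saving))
print("A:       %.0f%% saved" % (100 * a_saving))
print("A() per call: %.0f ns" % (a_init / N * 1e9))
```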

Note that I have an empty constructor in both cases. In summary,
Integer has a custom tp_new slot with a pool to avoid
allocation/deallocation overhead, and PY_NEW here saves quite a bit
(0.16 s versus 0.67 s). The 30% speed difference for a standard cdef
class seems to hold about right; probably two thirds of the time is
spent allocating memory from the heap (and also releasing it, in my
test).
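
For readers who have not met the distinction between allocation and initialisation, here is a rough pure-Python analogue. It is not the PY_NEW macro itself, and this class A is a hypothetical Python class rather than the cdef class above. Calling the class runs both __new__ and __init__, while calling __new__ directly only allocates the object.

```python
class A(object):
    init_calls = 0  # counts how often __init__ has run

    def __init__(self):
        A.init_calls += 1


a = A()           # allocation via __new__, then initialisation via __init__
b = A.__new__(A)  # allocation only; __init__ is skipped

print(isinstance(b, A))  # True
print(A.init_calls)      # 1
```

An object created this way has not been initialised, so this only makes sense when the caller fills in the state itself afterwards.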

- Robert

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
