On Apr 15, 2009, at 7:31 PM, Brent Pedersen wrote:
> On Wed, Apr 15, 2009 at 11:25 AM, Stefan Behnel
> <[email protected]> wrote:
>> Hi,
>>
>> thanks for sharing that.
>>
>> Brent Pedersen wrote:
>>> assuming i haven't done anything stupid
>>
>> No, nothing stupid, but something that can reduce the comparability
>> of the timings. You are creating a 1000000 item list on each
>> benchmark, using a call to range() in some cases and a list
>> comprehension in others.
>>
>> It's usually better to move initialisations out of the timings, e.g.
>> by creating a large range() object once and re-using it. That reduces
>> the impact of unrelated operations on the absolute numbers.
>>
>
> ah, i see, updated that. fixing that makes the python constructor look
> even slower.
> now it assumes that creating a list comprehension without assigning to
> a variable is the same as calling a function that returns an array,
> also without assigning.
>
> here are the new timings:
>
> PY_NEW on Cython class: 1.137
> __init__ on Python class: 28.468
> __init__ on Python class with slots: 9.936
> batch PY_NEW total: 0.821 , interval only: 0.363
> batch __init__ on Cython class total 0.975 , interval_only: 0.524
> __init__ on Cython class 1.154
>
> so for this case using the PY_NEW macro actually doesn't improve speed
> that much over a cdef'ed class, especially if using a "batch" method is
> applicable (as it is for my use-case).
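
To make the point quoted above about keeping initialisation out of the timings concrete, here is a minimal pure-Python sketch. It is not the benchmark from this thread: the names N, DATA, build_inside and build_outside are made up for illustration, and it only shows that the timed callable can re-use data built once beforehand.

```python
import timeit

N = 100000  # hypothetical problem size, much smaller than in the thread

def build_inside():
    # The input list is created inside the timed call, so its cost
    # is included in the measurement.
    data = list(range(N))
    return [x + 1 for x in data]

# The input list is created once, outside of anything that gets timed.
DATA = list(range(N))

def build_outside():
    # Only the list comprehension itself is measured here.
    return [x + 1 for x in DATA]

t_inside = timeit.timeit(build_inside, number=20)
t_outside = timeit.timeit(build_outside, number=20)
print("setup inside timing:  %.3f s" % t_inside)
print("setup outside timing: %.3f s" % t_outside)
```

Both functions return the same list; only the second keeps the unrelated list creation out of the number that gets reported.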

Here are some more data points, specific to Sage (Python 2.5, Cython 0.11)
-------------------
%cython
from sage.rings.integer cimport Integer

cdef class A:
    pass

def time_py_new(long N):
    cdef long i
    for i from 0 <= i < N:
        z = PY_NEW(Integer)

def time_py_init(long N):
    cdef long i
    for i from 0 <= i < N:
        z = Integer()

def time_py_new_A(long N):
    cdef long i
    for i from 0 <= i < N:
        z = PY_NEW(A)

def time_py_init_A(long N):
    cdef long i
    for i from 0 <= i < N:
        z = A()
--------------------
sage: time time_py_init(10**7)
Time: CPU 0.67 s, Wall: 0.68 s
sage: time time_py_new(10**7)
Time: CPU 0.16 s, Wall: 0.17 s
sage: time time_py_init_A(10**7)
Time: CPU 0.66 s, Wall: 0.67 s
sage: time time_py_new_A(10**7)
Time: CPU 0.47 s, Wall: 0.48 s
Note that I have an empty constructor in both cases. In summary,
Integer has a custom tp_new slot with a pool to avoid
allocation/deallocation overhead, and PY_NEW here saves quite a bit
(0.16 s vs. 0.67 s). The roughly 30% speed difference for a standard
cdef class seems to hold (0.47 s vs. 0.66 s); probably two thirds of
the time is spent allocating memory from the heap (and also releasing
it, in my test).
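
For anyone without Sage at hand, the same distinction can be sketched in plain Python. This is only a rough analogue, assuming that calling __new__ directly stands in for PY_NEW (which goes straight to the type's tp_new slot and skips __init__); the class Plain and the two loop functions are hypothetical, and the absolute numbers will not match the ones above.

```python
import timeit

class Plain(object):
    # Hypothetical stand-in for a class with a (cheap) constructor.
    def __init__(self):
        self.ready = True

def make_with_init(n):
    # Normal construction: allocation via __new__, then __init__.
    for _ in range(n):
        z = Plain()

def make_with_new(n):
    # Allocation only: __init__ is never called.
    for _ in range(n):
        z = Plain.__new__(Plain)

# An object created through __new__ alone has not been initialised.
obj = Plain.__new__(Plain)
skipped_init = not hasattr(obj, "ready")

t_init = timeit.timeit(lambda: make_with_init(100000), number=5)
t_new = timeit.timeit(lambda: make_with_new(100000), number=5)
print("with __init__: %.3f s" % t_init)
print("__new__ only:  %.3f s" % t_new)
```

The allocation and deallocation still happen in both loops, so only the constructor call itself is saved.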
- Robert