Mark Lodato wrote:
> On Sat, Oct 3, 2009 at 4:46 AM, Dag Sverre Seljebotn
> <[email protected]> wrote:
>> In the interest of fixing
>>
>> http://trac.cython.org/cython_trac/ticket/281
>>
>> I'm doing some benchmarks which are rather surprising to me.
>> Explanations? Unless something is wrong it seems I can safely replace
>> __pyx_lineno and friends with thread variables (on Python versions which
>> support it, haven't checked that yet).
>>
>> In [12]: %timeit dagss.globalvar(1e3)
>> 100 loops, best of 3: 6.18 ms per loop
>>
>> In [13]: %timeit dagss.threadvar(1e3)
>> 10000 loops, best of 3: 553 micros per loop
>>
>> In [14]: %timeit dagss.threadvar(1e6)
>> 10 loops, best of 3: 153 ms per loop
> 
> 
> I don't get the same result at all:
> 
> $ gcc -O0 -shared -I/usr/include/python2.6 -g -fPIC dagss.c -o dagss.so
> 
> In [2]: %timeit dagss.globalvar(1e3)
> 100000 loops, best of 3: 10.6 µs per loop
> 
> In [3]: %timeit dagss.threadvar(1e3)
> 10000 loops, best of 3: 446 µs per loop
> 
> In [4]: %timeit dagss.globalvar(1e6)
> 100 loops, best of 3: 10.3 ms per loop
> 
> In [5]: %timeit dagss.threadvar(1e6)
> 10 loops, best of 3: 147 ms per loop
> 
> $ gcc -O1 -shared -I/usr/include/python2.6 -g -fPIC dagss.c -o dagss.so
> 
> In [2]: %timeit dagss.globalvar(1e3)
> 1000000 loops, best of 3: 1.67 µs per loop
> 
> In [3]: %timeit dagss.threadvar(1e3)
> 10000 loops, best of 3: 451 µs per loop
> 
> In [4]: %timeit dagss.globalvar(1e6)
> 1000 loops, best of 3: 1.41 ms per loop
> 
> In [5]: %timeit dagss.threadvar(1e6)
> 10 loops, best of 3: 145 ms per loop
> 
> * -O2 or -O3:
> 
> With these optimization levels, the compiler optimizes out the loop.
> 
> * Setup:
> 
> The only change I made was to change i from an 'int' to a 'long' to
> remove a compiler warning.
> 
> Cython: version 0.11.3
> gcc: version 4.3.3
> OS: Ubuntu 9.04
> Processor: Intel Core 2 Duo E6300
> Compile line: gcc -shared -I/usr/include/python2.6 -g -fPIC -O0 dagss.c -o dagss.so

Now I feel stupid ;-)

I compiled with runtests.py, which turns on the refnanny, so the
getter/setter functions carried far more overhead than the function
calls into the CPython lib.

Thanks for doing this; I'll just trust your numbers. IMO the
thread-variable overhead is not too horrible, and I think thread-local
variables are cheap enough that we can safely use them to propagate
exceptions -- *if* we for some reason don't want __pyx_lineno and
__pyx_clineno as function-local variables. (Which we shouldn't, IMO, if
there's any chance at all that it could decrease normal running
performance.)
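
To make the thread-local idea concrete, here is a rough sketch in plain
Python rather than the C that Cython would actually emit. It only shows
the property being relied on: each thread sees its own copy of the stored
line number, so one thread cannot overwrite another's. The names
record_position, current_position and run_demo are hypothetical and come
from neither Cython nor dagss.c.

```python
import threading

# Hypothetical stand-in for per-thread error-position storage.
# Each thread that touches _error_pos gets its own independent attributes.
_error_pos = threading.local()


def record_position(lineno):
    """Store a line number for the calling thread only."""
    _error_pos.lineno = lineno


def current_position():
    """Return the calling thread's stored line number, or None if unset."""
    return getattr(_error_pos, "lineno", None)


def worker(lineno, results, barrier):
    record_position(lineno)
    # Wait until both threads have written, so that a shared global
    # would already have been overwritten by the other thread.
    barrier.wait()
    results[lineno] = current_position()


def run_demo():
    """Run two threads that each record and read back a different line."""
    results = {}
    barrier = threading.Barrier(2)
    threads = [
        threading.Thread(target=worker, args=(n, results, barrier))
        for n in (10, 20)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling run_demo() should give each thread back the value it wrote,
whereas a single module-level global would hold only whichever write
happened last.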

-- 
Dag Sverre
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
