Amir wrote:
> The end of the numpy section of the cython docs suggests the possibility to
> call numpy/scipy functions without Python call overhead. How can this be done?
> 
> A test script at bottom is 1.8 times faster when I expand numpy calls into
> simple for loops (n,m = 1000,1500). weave.inline is 2.7 times faster. Looking
> at the cython -a output, not sure where most of that time is lost. Looks like
> strides generate many more calls and dot products are done using Python calls
> for multiplications, for example.

Yes, unfortunately that is the current status: the only thing Cython 
optimizes is element indexing (i.e. your theta[j] and v[j]). That is 
where you would really remove a bottleneck in some code, but it means 
that "mixed" code like yours, which also calls whole-array NumPy 
functions, doesn't benefit that much.

Remember though that in your case, as n and m go to infinity, the 
Python call overhead becomes a rather small fraction of the total runtime.
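
To make that argument concrete, here is a rough cost model in plain 
Python. It is only a sketch: the function name overhead_fraction and 
both cost parameters are made up for illustration and are not 
measurements of NumPy or Cython. It assumes each call pays a fixed 
overhead while the useful work scales with n*m.

```python
def overhead_fraction(n, m, call_overhead_ns=1000.0, per_element_ns=1.0):
    """Share of one call's runtime that goes to fixed call overhead.

    Both cost parameters are hypothetical placeholders, not measurements.
    """
    # Assume the real work touches every one of the n*m elements once.
    work_ns = per_element_ns * n * m
    return call_overhead_ns / (call_overhead_ns + work_ns)


if __name__ == "__main__":
    # The last pair matches the n, m used in the test script above.
    for n, m in [(10, 15), (100, 150), (1000, 1500)]:
        share = overhead_fraction(n, m)
        print(f"n={n:5d} m={m:5d} overhead share = {share:.4%}")
```

Under these assumptions the overhead share shrinks roughly like 
1/(n*m), so it only matters for small arrays or for loops that make 
many calls on small slices.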

If you want to, you could have a look at calling e.g. dot using the 
NumPy C API defined in the NumPy header files. Then you can supply an 
implementation in Cython/Includes/numpy.pxd like this:

cdef inline dot(a, b):
     ...

That would remove a (quite small) part of the call overhead. Further 
improvements would require changes to NumPy.

A full implementation of the things alluded to in 
http://wiki.cython.org/enhancements/buffersyntax could potentially fix 
this as well, but its status is still uncertain, and if it happens the 
timeframe is about a year from now.

-- 
Dag Sverre
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
