On Tue, 25 Aug 2009, Adam Ginsburg wrote:

>>> Adam Ginsburg wrote:
>
> My test data set is 10^4 elements, which is typical for what I expect
> to deal with but it could go up an order of magnitude.  Of course, I
> need to do 10^4 element sets ~10^4 times each...
>
>
>> I'd add a mode='c' option to all the cnp.ndarray's -- this will speed up 
>> access.
>>
>> A future optimization would be the @cython.boundscheck(False) directive.
>
> Do you mean future as in "don't try to use it now" or "should use it
> if it's safe to proceed without boundary checking"?  Also, this looks
> like a decorator to me, but my module wouldn't compile when I put it
> in front of my function definition.

Did you cimport cython?  The decorator needs a "cimport cython" (not a
plain Python import) at the top of the .pyx file.
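
For reference, here is a minimal sketch of where the pieces go (the
function is just a placeholder to show the declarations):

    cimport cython
    cimport numpy as cnp

    cnp.import_array()   # initialize numpy's C API

    @cython.boundscheck(False)
    def total(cnp.ndarray[cnp.float64_t, ndim=1, mode='c'] z):
        # mode='c' promises a contiguous buffer, so z[i] is a plain
        # pointer offset; boundscheck(False) drops the range test on it
        cdef Py_ssize_t i
        cdef double s = 0.0
        for i in range(z.shape[0]):
            s += z[i]
        return s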

>> In the inner loops get rid of *all* python operations.  For example:
>>
>> line 63: z    = z[z>=xmin]
>>    this is heavy on numpy operations (allocates & discards a temp
>> boolean array every time, etc) and might kill performance.
>>
>> line 64:  n = float(...)  ==> n = <float>(...)  # replace a Python
>> cast with a C-level cast
>
> I think I came up with a way around these both.  <float> didn't work,
> though - I received errors when I tried it.
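
Hard to say without seeing the error, but note that <float> casts a
single scalar expression; applied to a whole array it won't do what you
want.  What I had in mind was roughly this (a sketch, reusing the
cimports from above; count_above is a made-up name):

    @cython.boundscheck(False)
    def count_above(cnp.ndarray[cnp.float64_t, ndim=1, mode='c'] z,
                    double xmin):
        cdef Py_ssize_t i, m = 0
        for i in range(z.shape[0]):
            if z[i] >= xmin:
                m += 1       # count survivors in C; no temp boolean array
        return <float>m      # C-level cast instead of Python float()

(If you need the filtered z itself afterwards, copy the surviving
elements into a second preallocated array inside the same loop.)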
>
>> line 80:  cf   = 1-(xmin/z)**a
>>    This is again heavy on Python -- you might do better with a loop
>> over z and use pow from math.h.
>
> OK, I put cx and cf into loops and switched from ** to pow.
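
Right.  For the archives, the shape of that loop is roughly this (a
sketch; fill_cf is a made-up name, and it assumes z and cf are
same-length contiguous arrays):

    cdef extern from "math.h":
        double pow(double x, double y)

    @cython.boundscheck(False)
    def fill_cf(cnp.ndarray[cnp.float64_t, ndim=1, mode='c'] z,
                cnp.ndarray[cnp.float64_t, ndim=1, mode='c'] cf,
                double xmin, double a):
        cdef Py_ssize_t i
        for i in range(z.shape[0]):
            # C pow from math.h; no Python ** dispatch per element
            cf[i] = 1.0 - pow(xmin / z[i], a)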
>
>> Hope this helps (and let us know if it doesn't).
>
> I found a factor of 2-3 improvement.  In particular, I had left one
> python float() in, and that made python and cython run at ~the same
> speed.  When I changed it to <float>, the cython time dropped by a
> factor of 2.

This is actually a common occurrence: sometimes that last Python
operation takes 10x as long as everything else you're doing, so until
you eliminate them all you won't see the huge jump in speed.
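
(An easy way to find the stragglers is to compile with annotation
turned on, e.g. "cython -a yourmodule.pyx", and open the generated
HTML: the lines highlighted in yellow are the ones still going through
the Python C-API.)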

> Now execution times are:
> n=2e4
> python~3x fortran
> cython~1.5x fortran

10^4 still isn't a huge array, but I would be curious how much of a
speed increase you could get without unrolling all your loops (i.e.
with only a minimal modification to your code).  It might be 1.6x, it
might not.

> So this may be as fast as I can get.  I'm a little confused about
> fortran getting slower relative to python as n gets larger, but that
> is probably some sort of failure on my part.

This is because python and fortran process the array itself at nearly
the same speed, so as the array gets larger, the fixed Python
function-call overhead becomes a (relatively) smaller share of the
total runtime.
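
(With made-up numbers: if each call carries a fixed 10 microseconds of
Python overhead and each element costs 1 nanosecond of real work, the
overhead is half the total runtime at n = 10^4 but under 10% at
n = 10^5.)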

- Robert