I tried to tweak my Cython code for performance by manually inlining a small function, and ended up with slower code. I must confess I don't really understand what is going on here. If somebody has an explanation, I'd be delighted. The code follows.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
from numpy import zeros

# Make sure numpy is initialized.
include "c_numpy.pxd"

##############################################################################
cdef int inner_loop(float c_x, float c_y):
    cdef float x, y, x_buffer
    x = 0; y = 0
    cdef int i
    for i in range(50):
        x_buffer = x*x - y*y + c_x
        y = 2*x*y + c_y
        x = x_buffer
        if (x*x + x*y > 100):
            return 50 - i

def do_Mandelbrot_cython():
    cdef ndarray threshold_time
    threshold_time = zeros((500, 500))
    cdef double *tp
    cdef float c_x, c_y
    cdef int i, j
    c_x = -1.5
    tp = <double*>threshold_time.data
    for i in range(500):
        c_y = -1
        for j in range(500):
            tp += 1
            c_y += 0.004
            tp[0] = inner_loop(c_x, c_y)
        c_x += 0.004
    return threshold_time

def do_Mandelbrot_cython2():
    cdef ndarray threshold_time
    threshold_time = zeros((500, 500))
    cdef double *tp
    tp = <double*>threshold_time.data
    cdef float x, y, xbuffer, c_x, c_y
    cdef int i, j, n
    c_x = -1.5
    for i in range(500):
        c_y = -1
        for j in range(500):
            tp += 1
            c_y += 0.004
            x = 0; y = 0
            for n in range(50):
                x_buffer = x*x - y*y + c_x
                y = 2*x*y + c_y
                x = x_buffer
                if (x*x + y*y > 100):
                    tp[0] = 50 - n
                    break
        c_x += 0.004
    return threshold_time
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

And the timings I get are:

In [2]: %timeit C.do_Mandelbrot_cython2()
10 loops, best of 3: 342 ms per loop

In [3]: %timeit C.do_Mandelbrot_cython()
10 loops, best of 3: 126 ms per loop

Cheers,

Gaël