> > cdef class Model:
> >
> >     cdef public double a1, a2, b1, b2, d1, d2
> >
> >     def __call__(self, np.ndarray[np.float_t, ndim=1] y, double t):
> >         cdef np.ndarray[np.float_t, ndim=1] yprime = np.empty(3)
> >
> >         yprime[0] = y[0]*(1.0 - y[0]) - self.a1*y[0]*y[1]/(1.0 + self.b1*y[0])
> >         yprime[1] = (self.a1*y[0]*y[1]/(1.0 + self.b1*y[0])
> >                      - self.a2*y[1]*y[2]/(1.0 + self.b2*y[1]) - self.d1*y[1])
> >         yprime[2] = self.a2*y[1]*y[2]/(1.0 + self.b2*y[1]) - self.d2*y[2]
> >
> >         return yprime
> >
> 
> The amount of work that is done in this function is almost nothing -- i.e.
> "n" is hard-coded to 3. So I think you'll find that the thing killing
> performance here is calling the function and passing the arguments.
> 
This is generally true, and it is why I use f2py, and now Cython. But somehow
f2py is being more efficient here. Maybe it is the extra np.empty on each
call, which is done in Fortran in the other code; is there any way to set the
size of the ndarray without the Python call?
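(For what it's worth, one way around the per-call np.empty is to allocate the output array once and reuse it on every call. Here is a plain-Python sketch of the idea, with made-up parameter values; in the Cython version the buffer would be stored as a cdef attribute. Whether odeint is happy with a reused return buffer is an assumption to verify; it appears to copy the values it needs at each step.)

```python
import numpy as np

class Model:
    """Sketch: cache the output array instead of calling np.empty per call."""

    def __init__(self, a1, a2, b1, b2, d1, d2):
        self.a1, self.a2 = a1, a2
        self.b1, self.b2 = b1, b2
        self.d1, self.d2 = d1, d2
        self._yprime = np.empty(3)  # allocated once, reused on every call

    def __call__(self, y, t):
        yp = self._yprime
        yp[0] = y[0]*(1.0 - y[0]) - self.a1*y[0]*y[1]/(1.0 + self.b1*y[0])
        yp[1] = (self.a1*y[0]*y[1]/(1.0 + self.b1*y[0])
                 - self.a2*y[1]*y[2]/(1.0 + self.b2*y[1]) - self.d1*y[1])
        yp[2] = self.a2*y[1]*y[2]/(1.0 + self.b2*y[1]) - self.d2*y[2]
        return yp

model = Model(1.0, 1.0, 0.0, 0.0, 0.1, 0.1)  # hypothetical parameters
out = model(np.array([0.5, 0.5, 0.5]), 0.0)
```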

> For starters, use typed polymorphism: Make the function "cpdef" and give
> it another name, have a parent class "AbstractModel" with the same
> function in it, and in the calling code type the callee as AbstractModel.
> 
The calling code is Python; does this work in that case? I am not sure what
you mean by typing here.
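As I understand the suggestion (this is only an untested sketch, and the method name rhs is made up), it would look roughly like this in Cython. Note the fast dispatch only kicks in where the caller is itself compiled Cython code that types the variable; calls from plain Python still go through the Python call path.

```
cimport numpy as np
import numpy as np

cdef class AbstractModel:
    # cpdef methods get a fast C entry point when called from typed
    # Cython code, and remain callable from plain Python as well.
    cpdef np.ndarray rhs(self, np.ndarray y, double t):
        raise NotImplementedError

cdef class Model(AbstractModel):
    cdef public double a1, a2, b1, b2, d1, d2

    cpdef np.ndarray rhs(self, np.ndarray y, double t):
        cdef np.ndarray[np.float_t, ndim=1] yprime = np.empty(3)
        # ... fill yprime as in the __call__ above ...
        return yprime

# Only a compiled caller that types the variable benefits:
cdef void inner_loop(AbstractModel m, np.ndarray y):
    m.rhs(y, 0.0)   # dispatched at the C level, no Python call overhead
```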

> After that it would help to pass around raw float* rather than NumPy
> objects in this case when n is so small (unfortunately, there's no way to
> pass around an acquired buffer between functions. I have ideas of course,
> but they are not implemented.)
> 
Maybe this is what f2py does: it can pass the buffer around. float* isn't
really an option, as the calling program is Python.

> Without knowing the nature of the caller (where the real bottleneck
> likely is) it is difficult to give better advice.
> 
But the only difference in the caller between the f2py version and this one
is this file, so the speed difference must come from the function calls etc.,
as you have said. What I don't understand is how the overhead can come from
the caller, as that code is identical (and not compiled).

The code is basically:

for model.b1 in np.linspace(0, 2.6, 100):
    odeint(model, y0, t0)

where model is defined in a separate file, in one case as a fortran
object generated by f2py and in this case by cython.
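One way to narrow down where the gap comes from is to time the Python-side call overhead directly, separate from the integrator. A rough sketch (the function names are made up): compare a right-hand side that allocates its output on every call with one that reuses a preallocated buffer. If the two differ noticeably, the per-call np.empty is part of the gap; if not, the remaining overhead is the call and argument passing itself.

```python
import timeit
import numpy as np

def rhs_alloc(y, t):
    yprime = np.empty(3)      # fresh allocation on every call
    yprime[:] = y
    return yprime

_buf = np.empty(3)
def rhs_reuse(y, t):
    _buf[:] = y               # write into one shared, preallocated buffer
    return _buf

y0 = np.zeros(3)
n = 100000
t_alloc = timeit.timeit(lambda: rhs_alloc(y0, 0.0), number=n)
t_reuse = timeit.timeit(lambda: rhs_reuse(y0, 0.0), number=n)
print("alloc: %.3fs  reuse: %.3fs" % (t_alloc, t_reuse))
```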

Anyway, thanks for the comments; I know how hard it is to comment on partial
pieces of code. I will do some more investigating, but I am starting to think
Fortran is the better option, as f2py currently seems to make the calling
bridge more efficient (which is generically the bottleneck in our problems).

Gabriel
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
