On Wed, 21 Jul 2010 10:29:09 -0400, Frédéric Bastien <[email protected]> wrote:
> CudaNdarray is an object that we created. It is similar to GPUArray
> but with support for strides.

First of all, I don't understand why you're developing a
stride-supporting GPUArray separately from PyCUDA. (Does it depend on
all of Theano?) I'm sure lots of people would want that
functionality--including myself! Would you be willing to contribute this
code to PyCUDA in some form?

> I changed CudaNdarray to have a gpudata member that returns the
> pointer on the device.

> 
> You also use the _grid and _block members of GPUArray when we call a
> pycuda function created with ElementwiseKernel. I could replicate the
> creation of those members in CudaNdarray. I don't like that idea: if
> you change the way you compute those values, I would need to do so
> too, or we won't have the same speed.

I don't get this--you have a number of options here. You can inherit
from ElementwiseKernel and override the invocation logic. You can make
_grid and _block computed properties of your array that actually call
PyCUDA. Or you could simply replicate _grid and _block by calling
PyCUDA's splay--without having to reimplement splay.
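
To make the computed-property option concrete, here is a minimal sketch.
It rests on assumptions: StridedArraySketch and stand_in_splay are
hypothetical names invented for illustration, and stand_in_splay is NOT
PyCUDA's heuristic. It only exists so the snippet runs without a GPU.
In real code you would pass in PyCUDA's own splay instead, so the launch
configuration always follows whatever PyCUDA currently computes.

```python
from functools import reduce
import operator


def stand_in_splay(n, max_threads=512):
    """Hypothetical placeholder for PyCUDA's splay (not its real heuristic).

    Returns a (grid, block) pair with enough threads to cover n elements.
    """
    threads = max(1, min(max_threads, n))
    blocks = (n + threads - 1) // threads  # ceiling division
    return (blocks, 1), (threads, 1, 1)


class StridedArraySketch:
    """Hypothetical stand-in for a strided GPU array such as CudaNdarray."""

    def __init__(self, shape, splay=stand_in_splay):
        self.shape = tuple(shape)
        self.size = reduce(operator.mul, self.shape, 1)
        # Assumption: in real code this would be PyCUDA's splay function.
        self._splay = splay

    @property
    def _grid(self):
        # Computed on access, so it never drifts from the splay in use.
        return self._splay(self.size)[0]

    @property
    def _block(self):
        return self._splay(self.size)[1]
```

With this shape, the array never stores its own copy of the launch
configuration, so a change to the splay logic needs no matching change
on the array side.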

> I just thought of another way that would work more easily (see new
> patch attached). During the pycuda function call, if the input doesn't
> have _grid and _block members, pycuda calls its splay function and
> uses those values.

This inserts extra code into a heavily-traveled, somewhat
performance-relevant path in PyCUDA. Me no like.

Andreas

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda