Hi Andrea,
On Wed, Jul 18, 2012 at 12:14 AM, Andrea Cesari
<[email protected]> wrote:
> I know that, obviously, C and Python types are different. In my opinion, in
> the kernel I should declare a triple pointer, but reading some PyCUDA
> examples, I suppose that for C, each numpy array is seen as a single
> pointer.
Yes, numpy arrays are passed to kernels as pointers to flat arrays. If
your arrays contain multi-dimensional data, you have to construct the
flat index manually inside the kernel using threadIdx.*, blockDim.*,
blockIdx.*, etc.
So your kernel should be written as:
__global__ void doublify(float *a)
{
    int idx = threadIdx.x + blockIdx.x * blockDim.x; // x coordinate (numpy axis 2)
    int idy = threadIdx.y + blockIdx.y * blockDim.y; // y coordinate (numpy axis 1)
    int x_width = blockDim.x * gridDim.x;
    int y_width = blockDim.y * gridDim.y;
    for (int idz = 0; idz < 10; idz++) // loop over z coordinate (numpy axis 0)
    {
        int flat_id = idx + x_width * idy + (x_width * y_width) * idz;
        a[flat_id] *= 2;
    }
}
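For a C-contiguous numpy array, that flat-index formula matches numpy's own
row-major layout, which you can verify on the host side with pure numpy (no
GPU needed; the (10, 4, 4) shape here is just an illustration):

```python
import numpy as np

# Illustrative shape: z (axis 0) = 10, y (axis 1) = 4, x (axis 2) = 4
a = np.arange(10 * 4 * 4, dtype=np.float32).reshape(10, 4, 4)
y_width, x_width = a.shape[1], a.shape[2]
flat = a.ravel()  # the flat 1-D view the kernel sees

# The kernel's flat_id formula recovers exactly a[idz, idy, idx]
for idz in range(a.shape[0]):
    for idy in range(y_width):
        for idx in range(x_width):
            flat_id = idx + x_width * idy + (x_width * y_width) * idz
            assert flat[flat_id] == a[idz, idy, idx]
```

Note this only holds for C-ordered (row-major) arrays, which is numpy's
default; a Fortran-ordered array would need the axes reversed.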
An alternative solution would be to use
pycuda.elementwise.ElementwiseKernel, since you do not really need
your array's shape inside the kernel: all threads are independent.
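For reference, a minimal ElementwiseKernel sketch; the CPU fallback branch
is only there so the snippet runs on machines without PyCUDA or a CUDA
device, and the small random array is just an illustration:

```python
import numpy as np

try:
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    import pycuda.gpuarray as gpuarray
    from pycuda.elementwise import ElementwiseKernel
    HAVE_GPU = True
except Exception:
    HAVE_GPU = False

a = np.random.randn(10, 4, 4).astype(np.float32)
expected = a * 2

if HAVE_GPU:
    # ElementwiseKernel treats the array as flat; 'i' is the flat index.
    doublify = ElementwiseKernel("float *a", "a[i] *= 2", "doublify")
    a_gpu = gpuarray.to_gpu(a)
    doublify(a_gpu)
    result = a_gpu.get()
else:
    result = a * 2  # CPU fallback so the sketch runs without a GPU
```

Because the operation is purely elementwise, the kernel never needs to know
the array's shape, which is exactly why ElementwiseKernel fits here.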
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda