Hi Andrea,

On Wed, Jul 18, 2012 at 12:14 AM, Andrea Cesari
<[email protected]> wrote:
> I know that, obviously, C and Python types are different. In my opinion,
> in the kernel I should declare a triple pointer, but reading some PyCUDA
> examples, I suppose that in C each numpy array is seen as a single
> pointer.

Yes, numpy arrays are passed to kernels as pointers to flat arrays. If
your array holds multi-dimensional data, you have to construct the flat
index manually inside the kernel using threadIdx.*, blockDim.*,
blockIdx.*, etc.
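
As a quick sanity check (plain numpy, no GPU needed), here is a sketch
showing that this flat index matches numpy's default C-contiguous
(row-major) layout; the shape and indices are made up for illustration:

```python
import numpy as np

# Hypothetical shape, axis 0 = z, axis 1 = y, axis 2 = x.
z_size, y_width, x_width = 10, 4, 6
a = np.arange(z_size * y_width * x_width,
              dtype=np.float32).reshape(z_size, y_width, x_width)

# The flat index a CUDA kernel would compute for element (idz, idy, idx):
idz, idy, idx = 3, 2, 5
flat_id = idx + x_width * idy + (x_width * y_width) * idz

# Addresses the same element as the multi-dimensional index, because
# numpy arrays are C-contiguous by default.
print(a.ravel()[flat_id] == a[idz, idy, idx])  # True
```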

So your kernel should be written as:

__global__ void doublify(float *a)
{
  int idx = threadIdx.x + blockIdx.x * blockDim.x; // x coordinate (numpy axis 2)
  int idy = threadIdx.y + blockIdx.y * blockDim.y; // y coordinate (numpy axis 1)
  int x_width = blockDim.x * gridDim.x;
  int y_width = blockDim.y * gridDim.y;
  for(int idz = 0; idz < 10; idz++) // loop over z coordinate (numpy axis 0)
  {
      int flat_id = idx + x_width * idy + (x_width * y_width) * idz;
      a[flat_id] *= 2;
  }
}

An alternative solution would be to use
pycuda.elementwise.ElementwiseKernel, since you do not really need
your array's shape inside the kernel: all threads are independent.

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda