Hey Bryan,

First of all, sorry for the late reply. I just returned from a crazy
sequence of trips and am now working my way through the backlog...

On Wed, 21 Jul 2010 20:54:14 -0700, Bryan Catanzaro 
<catan...@eecs.berkeley.edu> wrote:
> Using numpy.intp works for kernels launched by pycuda.  But it doesn't
> work for Pycuda memcpy, which complains that numpy.intp (which seems
> aliased to numpy.int32 on my machine) doesn't match the c++
> signature:
> 
> Boost.Python.ArgumentError: Python argument types in
>     pycuda._driver.memcpy_dtod(numpy.int32, DeviceAllocation, int)
> did not match C++ signature:
>     memcpy_dtod(unsigned int dest, unsigned int src, unsigned int size)

Ok, I'm beginning to understand what's at work here--Boost.Python simply
doesn't accept numpy array scalars as integer arguments. My pyublas
module fixes that to some extent, but that's really not viable here.
Kernel arguments go through the buffer interface, so they are an
entirely different affair.
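To illustrate the underlying issue (a minimal sketch, independent of
PyCUDA itself): numpy array scalars are not instances of Python's int
type, so Boost.Python's built-in converters for integer arguments
don't match them, while an explicit int() cast yields a plain Python
int that does match:

```python
import numpy as np

# A numpy array scalar is not a Python int, which is why a C++
# 'unsigned int' parameter exposed via Boost.Python rejects it:
x = np.intp(42)
print(isinstance(x, int))       # False
print(isinstance(int(x), int))  # True: the cast yields a plain int
```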

At first I was leaning towards calling this a bug, but the more I
thought about it, the less convinced I became. If you follow these
guidelines, I don't think you should run into trouble:

- When storing GPU pointers in your code, use either DeviceAllocation
  objects or bare Python int (or long) objects. These two should be
  interchangeable in all things PyCUDA where a device pointer is
  requested. If you need arithmetic to work, add int() casts, but be
  aware that killing the DeviceAllocation makes your memory go away. Do
  not use numpy scalars to store pointers.

- If (and only if) you are using the unprepared kernel invocation
  syntax, you need to convert pointers to numpy.intp before passing
  them as arguments.
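A minimal sketch of both guidelines (the pycuda calls are shown as
comments since they need a live CUDA device; the name d_buf and the
stand-in address are hypothetical):

```python
import numpy as np
# import pycuda.driver as drv    # needs a CUDA device; shown for context

# Guideline 1: store device pointers as DeviceAllocation objects or
# plain Python ints -- never as numpy scalars.
# d_buf = drv.mem_alloc(16)      # DeviceAllocation
# ptr = int(d_buf)               # plain int; keep d_buf alive, or the
#                                # memory is freed out from under you
ptr = 0x10000                    # stand-in device address for illustration

# Arithmetic works on the int form:
ptr_off = ptr + 8

# Guideline 2: only for the unprepared call syntax, wrap the pointer
# in numpy.intp so it is marshalled at pointer width:
arg = np.intp(ptr_off)

# numpy.intp is pointer-sized, so the address round-trips exactly:
assert int(arg) == ptr_off
```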

I believe that if you follow these guidelines, you should be fine.
Am I overlooking anything?

> And it doesn't work for my own C++ functions (which call CUDA
> functions), which I'm interoperating with Pycuda.  They also expect to
> be able to extract a pointer out of gpudata, and when they get a
> numpy.int32, they die.

See above--is anything requiring that you use numpy scalars?

> The patch I attached yesterday goes a step in that direction, but
> ultimately what I really want is a C++ implementation of GPUArray.  =)

With all the code generation going on in GPUArray, I actually highly
doubt you want that, but ok. :)

Andreas


_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
