Received from Ahmed Fasih on Wed, Nov 07, 2012 at 09:02:15PM EST:
> Hi folks, please take my continued stream of questions as an indication
> of the versatility and flexibility of PyCUDA, instead of my slowness
> with it.
>
> I have PyCUDA that allocates memory on the host and on the device to
> store the output of a kernel calculation. E.g.,
>
> ### start code
> import numpy
> import pycuda.driver as cuda
> import pycuda.autoinit
> image = numpy.zeros((1024, 1024), dtype=numpy.complex64, order='C')
> image_gpu = cuda.mem_alloc(image.nbytes)
>
> # kernel invocation using image_gpu, works fine.
> ### end code
>
> I'd like to replace this idiom with one involving page-locked memory,
> specifically device-mapped memory. I have something as follows:
>
> ### start code
> image = cuda.aligned_empty((1024,1024), dtype=numpy.complex64,
>                            order='C')
> image_gpu = cuda.register_host_memory(image,
>     flags=cuda.mem_host_register_flags.DEVICEMAP)
> ### end code
>
> Same kernel invocation using image_gpu fails,
> "pycuda._driver.LogicError: cuLaunchKernel failed: invalid value"
>
> I also try sending the kernel the integer (pointer) returned by
> register_host_memory()'s returned value's basemap's
> get_device_pointer(). In other words, I tried this:
>
> ### start code
> image_gpu_return = cuda.register_host_memory(self.image,
>     flags=cuda.mem_host_register_flags.DEVICEMAP)
> image_gpu = image_gpu_return.base.get_device_pointer()
>
> # kernel invocation using image_gpu, fails.
> ### end code
>
> Note that it is the kernel invocation that throws the exception---the
> aligned_empty & register_host_memory & get_device_pointer calls are
> fine.
>
> Anyone used this CUDA feature in PyCUDA and can shed some light on how
> to do it right? Thanks,
> Ahmed
>
> PS. Perhaps a note on why I'm trying to use device-mapped memory: any
> memory allocated this way will be written only once by the kernel, and
> it might not fit entirely in GPU memory.

Not sure why you are observing a failure; the following gists run without
error on my system (CUDA 4.2.9, PyCUDA 2012.1, NVIDIA driver 295.40,
64-bit Linux):

https://gist.github.com/4036297
https://gist.github.com/4036292

L.G.

_______________________________________________
PyCUDA mailing list
http://lists.tiker.net/listinfo/pycuda
