Re: [PyCUDA] Porting nvidia's separable convolution example to pycuda: C++ templates, loop unrolling

Andrew Wagner Sat, 13 Jun 2009 20:37:24 -0700

Thanks again, Nicolas!  I suspected that was the case...

If this is still the case in the latest pycuda, the documentation for
get_global() at


http://documen.tician.de/pycuda/driver.html#code-on-the-device-modules-and-functions

needs to be corrected.

I have attached a short demo of using constant memory that does
this; it seems to work, at least for one easy case.  I release it
under whatever license the other pycuda demo scripts are under, so it
can be included in the next release if Andreas sees fit.

Unfortunately, this means something harder to debug is causing my
garbage results...

Cheers,
Drew

import numpy
import pycuda.autoinit
import pycuda.driver as cuda

code = '''
__device__ __constant__ float constantArray[32];

__global__ void copy_constant_into_global( float* globalResultArray){
    globalResultArray[threadIdx.x] = constantArray[threadIdx.x];
}
'''
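# Compile the kernel source and look up the kernel entry point.
module = cuda.SourceModule(code)
copy_constant_into_global = module.get_function("copy_constant_into_global")
# get_global() returns a (device_ptr, size_in_bytes) tuple; keep only the pointer.
constantArray = module.get_global('constantArray')[0]
hostArray = numpy.random.randint(0,255,(32,)).astype('float32')
print 'Host array: ',  hostArray
globalResultArray = cuda.mem_alloc_like(hostArray)
# Fill constant memory from the host, then have each of the 32 threads
# copy one element from constant memory into global memory.
cuda.memcpy_htod(constantArray,  hostArray)
copy_constant_into_global(globalResultArray,  grid = (1, 1),  block=(32, 1, 1))
# Copy the result back to the host for comparison.
hostResultArray = numpy.zeros_like(hostArray)
cuda.memcpy_dtoh(hostResultArray, globalResultArray)
print 'Result array: ',  hostResultArray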
assert (hostResultArray == hostArray).all(),  'There was some error!'
boo = raw_input('Pausing so you can look at results... <Enter> to finish...')

