Thanks again, Nicolas! I suspected that was the case...
If this is still the case in the latest pycuda, the documentation for
get_global() at
http://documen.tician.de/pycuda/driver.html#code-on-the-device-modules-and-functions
needs to be corrected.
I have attached a short demo of using constant memory that does
this, and it seems to work for at least one easy case. I release it
under whatever license the other pycuda demo scripts are under, so it
can be included in the next release if Andreas sees fit.
Unfortunately, this means something harder to debug is causing my
garbage results...
Cheers,
Drew
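import numpy
import pycuda.autoinit  # creates a CUDA context on import
import pycuda.driver as cuda
from pycuda.compiler import SourceModule

# Kernel source: each thread copies one element from constant memory
# into a global-memory result array.
code = '''
__device__ __constant__ float constantArray[32];
__global__ void copy_constant_into_global(float* globalResultArray) {
    globalResultArray[threadIdx.x] = constantArray[threadIdx.x];
}
'''
module = SourceModule(code)
copy_constant_into_global = module.get_function("copy_constant_into_global")
# get_global() returns a tuple; element 0 is the device pointer.
constantArray = module.get_global('constantArray')[0]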
hostArray = numpy.random.randint(0, 255, (32,)).astype('float32')
print('Host array:', hostArray)
globalResultArray = cuda.mem_alloc_like(hostArray)
# Copy the host data into the constant-memory symbol, then run the kernel.
cuda.memcpy_htod(constantArray, hostArray)
copy_constant_into_global(globalResultArray, grid=(1, 1), block=(32, 1, 1))
hostResultArray = numpy.zeros_like(hostArray)
cuda.memcpy_dtoh(hostResultArray, globalResultArray)
print('Result array:', hostResultArray)
# .all must be called; the bare method object is always truthy.
assert (hostResultArray == hostArray).all(), 'There was some error!'
input('Pausing so you can look at results... <Enter> to finish...')
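
A possible sanity check to run before the memcpy_htod call, offered only as a sketch. It assumes that the second element of the tuple returned by get_global() is the symbol's size in bytes, and the helper name check_constant_upload is hypothetical, not part of pycuda. The size is hard-coded below as a stand-in so the snippet runs without a GPU; in the demo it would come from module.get_global('constantArray')[1].

```python
import numpy


def check_constant_upload(host_array, device_size_bytes,
                          expected_dtype=numpy.float32):
    """Hypothetical helper: verify that host_array matches the device symbol.

    Raises ValueError if the dtype differs from what the kernel expects or
    if the byte counts disagree, either of which could show up as garbage
    values on the device side.
    """
    if host_array.dtype != numpy.dtype(expected_dtype):
        raise ValueError('dtype is %s, kernel expects %s'
                         % (host_array.dtype, numpy.dtype(expected_dtype)))
    if host_array.nbytes != device_size_bytes:
        raise ValueError('host array is %d bytes, device symbol is %d bytes'
                         % (host_array.nbytes, device_size_bytes))
    return True


# Stand-in for module.get_global('constantArray')[1] (an assumption):
# float constantArray[32] occupies 32 * 4 bytes.
device_size_bytes = 32 * 4
host = numpy.arange(32).astype('float32')
check_constant_upload(host, device_size_bytes)
```

Forgetting the astype('float32') conversion is one way to trip this check, since randint() returns an integer array whose bytes the kernel would then read as floats.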
_______________________________________________
PyCUDA mailing list
http://tiker.net/mailman/listinfo/pycuda_tiker.net