Hi,
I ran into a problem when explicitly passing "shared_size = None" to a
kernel's prepared_call.
Example code that reproduces it is included below, after the error message.
Error message:
ArgumentError: Python argument types in
Function._launch_kernel(Function, tuple, tuple, str, NoneType, NoneType)
did not match C++ signature:
_launch_kernel(pycuda::function {lvalue}, pycudaboost::python::tuple,
pycudaboost::python::tuple, pycudaboost::python::api::object, unsigned int,
pycudaboost::python::api::object)

Example code:

import numpy
import pycuda.autoinit
import pycuda.driver as drv
import pycuda.gpuarray as gpuarray
from pycuda.compiler import SourceModule

mod = SourceModule("""
__global__ void multiply_them(float *dest, float *a, float *b)
{
    const int i = threadIdx.x;
    dest[i] = a[i] * b[i];
}
""")

multiply_them = mod.get_function("multiply_them")
# All three kernel arguments are device pointers.
multiply_them.prepare([numpy.intp, numpy.intp, numpy.intp])

d_a = gpuarray.to_gpu(numpy.random.randn(400).astype(numpy.float32))
d_b = gpuarray.to_gpu(numpy.random.randn(400).astype(numpy.float32))
d_dest = gpuarray.zeros_like(d_a)

# This call raises the ArgumentError shown above.
multiply_them.prepared_call((1, 1), (400, 1, 1),
                            d_dest.gpudata, d_a.gpudata, d_b.gpudata,
                            shared_size=None)

I'm using PyCUDA 2012.1.
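
A possible workaround, as a minimal sketch only: the error message suggests that the unsigned int slot of _launch_kernel is receiving None, so replacing None with 0 before the call should avoid the mismatch. This assumes that slot corresponds to shared_size and that 0 means "no dynamic shared memory". The helper name prepared_call_safe is hypothetical and not part of PyCUDA.

```python
def prepared_call_safe(func, grid, block, *args, shared_size=None):
    """Forward to func.prepared_call, replacing shared_size=None with 0.

    Hypothetical helper, not part of PyCUDA. It assumes the underlying
    launch expects an unsigned integer for shared_size, as the C++
    signature in the error message suggests.
    """
    if shared_size is None:
        # Assumption: 0 bytes requests no dynamic shared memory.
        shared_size = 0
    return func.prepared_call(grid, block, *args, shared_size=shared_size)
```

With the example code, the failing line would become prepared_call_safe(multiply_them, (1, 1), (400, 1, 1), d_dest.gpudata, d_a.gpudata, d_b.gpudata). Simply omitting the shared_size keyword may work as well, but I have not confirmed what default is used in that case.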
Best,
Yiyin