James Keaveney <[email protected]> writes:
> Thanks Andreas,
>
> If I switch out the line "cuda.memcpy_dtoh(y,Y_gpu)" and replace it with
> something else - either another memory copy or a synchronize option like
> pycuda.autoinit.context.synchronize(). The same also happens if I get
> rid of explicit memory copies and use the In(), Out() and InOut()
> methods. If I don't execute the kernel then the memory copies work
> without errors. So something in the kernel is causing an error, but this
> is one of the example kernels from nvidia! Can you see anything wrong
> with it? Does it execute on anyone elses machine?
>
> # CUDA grid
> block_size=(4,1,1)
> grid = (n/block_size[0],1)
>
> # CUDA source
> cusrc = SourceModule("""
> __global__ void saxpy(int n, double a, double *x, double *y)
> {
>     for (int i = blockIdx.x * blockDim.x + threadIdx.x;
>          i < n;
>          i += blockDim.x * gridDim.x)
>     {
>         y[i] = a * x[i] + y[i];
>     }
> }
> """)
> SAXPY = cusrc.get_function('saxpy')
>
> # data arrays
> w = 500 #arbitrary
> x = random.uniform(0,w,n) #.astype(float32) << same error with either float or double
> y = random.uniform(0,w,n) #.astype(float32)
>
> #init gpu (input) arrays
> a = float64(24.5)
> n = int32(n)
>
> SAXPY(cuda.In(n), cuda.In(a), cuda.In(x), cuda.InOut(y), grid=grid,
>       block=block_size)

Your problem is that you shouldn't use "In" on 'n'. Wrapping it in In() passes a pointer to device memory instead of the value of n, while the kernel declares n as a plain int.

Andreas
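
For reference, a minimal sketch of the call with the scalar passed by value, assuming a working PyCUDA install and a CUDA device. The helper names (KERNEL_SRC, grid_for, saxpy_cpu, saxpy_gpu) are made up for illustration and are not part of the original script. The kernel also takes 'a' by value, so the sketch passes it as a sized NumPy scalar as well; that is an assumption that goes beyond the reply above. Only the arrays are wrapped in In()/InOut().

```python
import numpy as np

# Kernel text as posted in the thread (grid-stride loop, double precision).
KERNEL_SRC = """
__global__ void saxpy(int n, double a, double *x, double *y)
{
    for (int i = blockIdx.x * blockDim.x + threadIdx.x;
         i < n;
         i += blockDim.x * gridDim.x)
    {
        y[i] = a * x[i] + y[i];
    }
}
"""


def grid_for(n, threads_per_block):
    """Blocks needed to cover n elements (ceiling division, integer result)."""
    return ((n + threads_per_block - 1) // threads_per_block, 1)


def saxpy_cpu(a, x, y):
    """NumPy reference result, handy for checking the GPU output."""
    return a * x + y


def saxpy_gpu(a, x, y, threads_per_block=4):
    """Run the kernel; needs PyCUDA and a GPU (imports are deferred for that reason)."""
    import pycuda.autoinit  # noqa: F401  (creates the CUDA context)
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule

    x = np.ascontiguousarray(x, dtype=np.float64)
    y = np.array(y, dtype=np.float64)  # private copy; InOut overwrites it
    n = x.size

    saxpy = SourceModule(KERNEL_SRC).get_function("saxpy")
    saxpy(
        np.int32(n),      # by value: matches "int n"
        np.float64(a),    # by value: matches "double a"
        cuda.In(x),       # array: copied to the device, passed as pointer
        cuda.InOut(y),    # array: copied to the device and back
        block=(threads_per_block, 1, 1),
        grid=grid_for(n, threads_per_block),
    )
    return y
```

On a machine with a GPU, np.allclose(saxpy_gpu(24.5, x, y), saxpy_cpu(24.5, x, y)) should hold for float64 inputs of equal length. grid_for uses integer division because n/block_size[0] yields a float under Python 3, which the grid argument would presumably reject.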
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
