James Keaveney <[email protected]> writes:

> Thanks Andreas,
>
> If I swap out the line "cuda.memcpy_dtoh(y,Y_gpu)" for something else, 
> either another memory copy or a synchronize call like 
> pycuda.autoinit.context.synchronize(), the error still occurs. The same 
> happens if I get rid of the explicit memory copies and use the In(), 
> Out() and InOut() methods. If I don't execute the kernel, the memory 
> copies work without errors. So something in the kernel is causing the 
> error, but this is one of the example kernels from NVIDIA! Can you see 
> anything wrong with it? Does it execute on anyone else's machine?
>
>      # CUDA grid
>      block_size=(4,1,1)
>      grid = (n/block_size[0],1)
>
>      # CUDA source
>      cusrc = SourceModule("""
>      __global__ void saxpy(int n, double a, double *x, double *y)
>      {
>      for (int i = blockIdx.x * blockDim.x + threadIdx.x;
>          i < n;
>          i += blockDim.x * gridDim.x)
>          {
>              y[i] = a * x[i] + y[i];
>          }
>      }
>      """)
>      SAXPY = cusrc.get_function('saxpy')
>
>      # data arrays
>      w = 500 #arbitrary
>      # same error with either float or double
>      x = random.uniform(0,w,n) #.astype(float32)
>      y = random.uniform(0,w,n) #.astype(float32)
>
>      #init gpu (input) arrays
>      a = float64(24.5)
>      n = int32(n)
>
>      SAXPY(cuda.In(n), cuda.In(a), cuda.In(x), cuda.InOut(y),
>            grid=grid, block=block_size)

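Your problem is that you shouldn't use "In" on 'n'. In() copies its
argument to device memory and passes the kernel a pointer to that copy,
but the kernel declares "int n" as a by-value parameter, so it ends up
reading a pointer where it expects the value of n. The same applies to
'a', which the kernel takes as a by-value "double". Pass scalars
directly as sized numpy scalars, e.g. int32(n) and float64(a), and keep
In()/InOut() for the arrays x and y.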
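
A minimal sketch of what the corrected launch might look like, assuming
float64 input arrays to match the "double *" parameters. The names
KERNEL_SRC, launch_config, saxpy_reference and saxpy_gpu are made up for
this illustration and are not part of the original script. The GPU path
needs a CUDA device and a working PyCUDA install, so it is imported
lazily; the reference function is only there to check results against.

```python
import numpy as np

# Kernel source as posted in the question (grid-stride loop).
KERNEL_SRC = """
__global__ void saxpy(int n, double a, double *x, double *y)
{
    for (int i = blockIdx.x * blockDim.x + threadIdx.x;
         i < n;
         i += blockDim.x * gridDim.x)
    {
        y[i] = a * x[i] + y[i];
    }
}
"""


def launch_config(n, threads_per_block=4):
    """Return (block, grid) as tuples of plain ints covering n elements."""
    # Ceiling division, so the result is an int on Python 2 and Python 3.
    blocks = (n + threads_per_block - 1) // threads_per_block
    return (threads_per_block, 1, 1), (max(blocks, 1), 1)


def saxpy_reference(a, x, y):
    """CPU reference for the kernel: returns a * x + y."""
    return a * x + y


def saxpy_gpu(a, x, y):
    """Run the kernel via PyCUDA, passing scalars by value."""
    # Lazy imports: these require a CUDA-capable device.
    import pycuda.autoinit  # noqa: F401  (creates the context)
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule

    x = np.ascontiguousarray(x, dtype=np.float64)
    y = np.array(y, dtype=np.float64)  # copy, so the caller's y is untouched
    block, grid = launch_config(x.size)

    saxpy = SourceModule(KERNEL_SRC).get_function("saxpy")
    # Scalars go in as sized numpy scalars (by value);
    # only the arrays are wrapped in In()/InOut().
    saxpy(np.int32(x.size), np.float64(a), cuda.In(x), cuda.InOut(y),
          block=block, grid=grid)
    return y
```

On a machine with a GPU, comparing saxpy_gpu(24.5, x, y) against
saxpy_reference(24.5, x, y) with numpy.allclose should be a quick way to
confirm the launch is doing what you expect.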

Andreas
