Hi, I did this and changed the kernel argument to int * to simplify the problem.
The script is now:
import numpy
import pycuda.autoinit  # creates the CUDA context
import pycuda.driver as drv
from pycuda.compiler import SourceModule

mod = SourceModule("""
__global__ void thread_index(int *dest)
{
    // each thread writes its own index into dest
    int i = threadIdx.x;
    dest[i] = i;
}
""")

lung_vett = 10
thread_index = mod.get_function("thread_index")
dest = numpy.zeros(lung_vett, dtype=numpy.int16)
thread_index(drv.Out(dest), block=(lung_vett, 1, 1))
print(dest)
and the result is:
[ 1 16256 1 16256 1 16256 1 16256 1 16256]
If I re-execute the script, the result is:
[ 2 16256 2 16256 2 16256 2 16256 2 16256]
With each re-execution, some of the elements are incremented by one.
Why? The result should be [0,1,2,3,4,5,6,7,8,9].
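
One thing that can be checked without the GPU is whether the host array and the kernel argument have the same element size. The sketch below uses only NumPy and assumes the usual 32-bit CUDA int; the names dest16, dest32, as_int32 and as_int16 are just for illustration, and the byte order is fixed to little-endian so the output does not depend on the machine. It shows the size mismatch only and does not by itself account for the 16256 values printed above.

```python
import numpy

lung_vett = 10

# Host buffer as allocated in the script: 2 bytes per element.
dest16 = numpy.zeros(lung_vett, dtype=numpy.int16)
# Host buffer whose element size matches a 32-bit int.
dest32 = numpy.zeros(lung_vett, dtype=numpy.int32)

print(dest16.nbytes)  # 20
print(dest32.nbytes)  # 40

# Five 32-bit values reinterpreted as ten 16-bit values
# (little-endian made explicit in the dtype strings).
as_int32 = numpy.arange(5, dtype="<i4")
as_int16 = as_int32.view("<i2")
print(as_int16.tolist())  # [0, 0, 1, 0, 2, 0, 3, 0, 4, 0]
```

So the int16 array is only half as large as ten 32-bit ints, and 32-bit values read back as int16 appear interleaved with zeros rather than as 0..9.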
Thanks,
Andrea Cesari
> To: [email protected]
> From: [email protected]
> Date: Mon, 9 Jul 2012 13:01:18 +0100
> Subject: Re: [PyCUDA] Thread Problem
>
> Andrea Cesari wrote:
>
> [...]
>
> > __global__ void thread_index(float *dest)
> ^^^^ you've said your kernel takes floats
> > dest=numpy.zeros(lung_vett)
>
> ^^^ but this creates an array of float64 (aka double) values.
>
> you want:
>
> dest = numpy.zeros(lung_vett, dtype=numpy.float32)
>
> > thread_index(drv.Out(dest),block=(lung_vett,1,1))
>
> Cheers,
> Lawrence
> --
> Lawrence Mitchell <[email protected]>
>
>
> _______________________________________________
> PyCUDA mailing list
> [email protected]
> http://lists.tiker.net/listinfo/pycuda