Received from Francis on Thu, Aug 04, 2011 at 08:58:33AM EDT:
> Hi Lev,
>
> Basically I'm testing, part by part, my CUDA C code and porting my kernel
> functions as PyCUDA source modules. The one I'm verifying right now is this
> part:
>
> projection_module = """
> __global__ void projection( char *List , int *l, int N, int L ) {
>
>     int tid = blockIdx.x * 512 + threadIdx.x;
>     int idx1 = ceilf( tid / ( N - L + 1 ) );
>     int idx2 = tid % ( N - L + 1 );
>
>     for ( int lcnt = 0; lcnt < L; lcnt++){
>         l[ (tid * L ) + lcnt ] = List[ (idx1 * N + idx2) + lcnt ];
>     }
> }
> """
>
> This works in CUDA C but surprisingly I get different values in PyCUDA.
>
> Best regards,
>
> ./francis
What happens when you pass the -use_fast_math option to nvcc in
PyCUDA? You can do this as follows:

    from pycuda.compiler import SourceModule

    proj = SourceModule(projection_module,
                        options=['-use_fast_math']).get_function('projection')
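
To see which of the two builds is producing the wrong values, it may help
to compare both against a plain CPU version of the same indexing. The
following is only a sketch of the quoted kernel's logic in Python; the
names projection_reference, n_threads and windows_per_row are made up for
illustration and are not part of your code or of PyCUDA. Note that in C,
tid / (N - L + 1) divides two ints, so the result is already truncated
before ceilf() sees it; the sketch uses // to match that.

```python
def projection_reference(data, N, L, n_threads):
    """CPU sketch of the quoted `projection` kernel (hypothetical helper)."""
    windows_per_row = N - L + 1          # windows of length L per row of length N
    out = [0] * (n_threads * L)
    for tid in range(n_threads):         # one iteration per GPU thread
        # Integer division, as in the C expression; ceilf() changes nothing there.
        idx1 = tid // windows_per_row    # row index
        idx2 = tid % windows_per_row     # window offset within the row
        start = idx1 * N + idx2
        for lcnt in range(L):
            out[tid * L + lcnt] = data[start + lcnt]
    return out


# Two rows of length N = 4, windows of length L = 2, three windows per row.
rows = list(range(8))
print(projection_reference(rows, N=4, L=2, n_threads=6))
# prints [0, 1, 1, 2, 2, 3, 4, 5, 5, 6, 6, 7]
```

If the CUDA C output matches this and the PyCUDA output does not, it may
also be worth checking that the PyCUDA launch uses 512 threads per block,
since the kernel hard-codes 512 when computing tid, and that N and L are
passed as 32-bit integers.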
L.G.