Hi vretiel,

I'm getting error = 0.0 for both float32 and float64.
If you compare the arrays returned from the kernel with "reference" arrays computed on the CPU using the same algorithm, is the nonzero error confined to a subset of the elements, or are all the values wrong? How big is the error compared to the actual values in the array? Does decreasing OUTPUT_SIZE or the grid size help?

On Sun, Oct 7, 2012 at 10:14 PM, vretiel <[email protected]> wrote:
> Hi all,
>
> Below is a code snippet where I call the same kernel on 2 empty arrays - the
> data returned from both calls should be the same - but it's not!
>
> When I run this code, I get a random error each time - I have no idea why.
>
> Anybody seen anything similar?
>
> Thanks.
>
> ----------Code-------------------------------------------------------------------------------------
> import numpy as np
> import string
>
> #pycuda stuff
> import pycuda.driver as drv
> import pycuda.autoinit
>
> from pycuda.compiler import SourceModule
>
> class MC:
>
>     cudacodetemplate = """
>     #include <stdio.h>
>
>     #define OUTPUT_SIZE 286
>
>     typedef $PRECISION REAL;
>
>     extern "C"
>     {
>         __global__ void test_coeff ( REAL* results )
>         {
>             int id = blockDim.x * blockIdx.x + threadIdx.x;
>
>             int out_index = OUTPUT_SIZE * id;
>             for (int i=0; i<OUTPUT_SIZE; i++)
>             {
>                 results[out_index+i]=log((double)(id+1));
>             }
>         }
>     }
>     """
>
>     def __init__(self, size, prec = np.float32):
>         drv.limit.MALLOC_HEAP_SIZE = 1024*1024*800
>
>         self.size = size
>         self.prec = prec
>         template = string.Template(MC.cudacodetemplate)
>         self.cudacode = template.substitute( PRECISION = 'float' if
>             prec==np.float32 else 'double')
>
>         self.module = pycuda.compiler.SourceModule(self.cudacode,
>             no_extern_c=True, options=['--ptxas-options=-v'])
>
>     def test(self, out_size):
>         test = np.zeros( ( 64, out_size*(2**self.size) ), dtype=self.prec )
>         test2 = np.zeros( ( 64, out_size*(2**self.size) ), dtype=self.prec )
>
>         test_coeff = self.module.get_function ('test_coeff')
>         test_coeff( drv.Out(test), block=(2**self.size,1,1), grid=( 64, 1 )
>             )
>         test_coeff( drv.Out(test2), block=(2**self.size,1,1), grid=( 64, 1 )
>             )
>         error = (test-test2)
>         return error
>
> if __name__ == '__main__':
>     p1 = MC ( 5, np.float64 )
>     err = p1.test(286)
>     print err.max()
>     print err.min()
>
>
> --
> View this message in context:
> http://pycuda.2962900.n2.nabble.com/Going-Crazy-tp7574863.html
> Sent from the PyCuda mailing list archive at Nabble.com.
>
> _______________________________________________
> PyCUDA mailing list
> [email protected]
> http://lists.tiker.net/listinfo/pycuda
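
For the CPU comparison asked about above, here is a minimal NumPy-only sketch of what I have in mind. It assumes the layout from the quoted kernel (thread `id` writes `log(id+1)` into `OUTPUT_SIZE` consecutive elements). The names `cpu_reference` and `summarize_mismatch` and the tolerance values are my own choices for illustration, not anything from PyCUDA. CPU and GPU `log` may differ in the last bits, so the comparison uses a tolerance rather than exact equality.

```python
import numpy as np

OUTPUT_SIZE = 286  # same value as the #define in the quoted kernel


def cpu_reference(n_blocks, block_size, out_size=OUTPUT_SIZE, dtype=np.float64):
    """Rebuild on the CPU what test_coeff is expected to write:
    thread `id` fills results[out_size*id : out_size*(id+1)] with log(id + 1)."""
    ids = np.arange(n_blocks * block_size, dtype=np.float64)
    # The kernel evaluates log in double and stores the result as REAL.
    per_thread = np.log(ids + 1.0).astype(dtype)
    return np.repeat(per_thread, out_size).reshape(n_blocks, out_size * block_size)


def summarize_mismatch(result, reference, out_size=OUTPUT_SIZE,
                       rtol=1e-5, atol=1e-6):
    """Count the elements that differ and locate the first affected thread.
    rtol/atol are assumed values; tighten or loosen them as needed."""
    result64 = np.asarray(result, dtype=np.float64)
    reference64 = np.asarray(reference, dtype=np.float64)
    bad = ~np.isclose(result64, reference64, rtol=rtol, atol=atol)
    bad_flat = np.flatnonzero(bad)  # indices into the C-order flattened array
    return {
        "n_bad": int(bad_flat.size),
        "fraction_bad": bad_flat.size / float(bad.size),
        "max_abs_err": float(np.abs(result64 - reference64).max()),
        "first_bad_thread": int(bad_flat[0] // out_size) if bad_flat.size else None,
    }


if __name__ == "__main__":
    # Stand-in for a kernel result: the reference with one corrupted element.
    reference = cpu_reference(64, 2 ** 5)
    fake_result = reference.copy()
    fake_result[3, 10] += 1.0
    print(summarize_mismatch(fake_result, reference))
```

With the real arrays, pass `test` and `test2` in place of `fake_result`. If `fraction_bad` is small and the bad indices cluster around particular thread ids, that points to a different kind of problem than every element being off.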
