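If the number of states you want to initialize isn't divisible by 64 (the block size), then you will have to launch threads that have no curandState to initialize. The nthreads argument lets those extra threads return early instead of calling curand_init on memory past the end of the array.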
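
To make the arithmetic concrete, here is a minimal sketch in plain Python (no GPU needed). The helper name launch_config and its block_size parameter are made up for illustration and are not part of PyCUDA; 64 is the block size used in the example quoted below.

```python
def launch_config(nstates, block_size=64):
    """Return (nblocks, nidle) for a 1-D launch that covers nstates states.

    Hypothetical helper for illustration only; not a PyCUDA function.
    """
    # Ceiling division: the smallest number of blocks with enough threads.
    nblocks = (nstates + block_size - 1) // block_size
    # Threads whose global id is >= nstates have no state to initialize.
    nidle = nblocks * block_size - nstates
    return nblocks, nidle


for n in (64, 100, 1000):
    nblocks, nidle = launch_config(n)
    print(n, nblocks, nidle)
```

For 100 states this gives 2 blocks and 28 idle threads, and those 28 are the ones the id >= nthreads check turns away. The quoted example computes the grid as size//64+1 instead, which launches one extra, fully idle block when size is an exact multiple of 64; that is harmless for the same reason.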

On Dec 8, 2011, at 10:24 PM, Thomas Wiecki wrote:

> Out of curiosity, is there a specific reason to supply the nthreads argument?
>
> On Thu, Dec 8, 2011 at 4:46 PM, Anthony LaTorre <[email protected]> wrote:
>>
>> Here is an example:
>>
>> import numpy as np
>> import pycuda.tools
>> from pycuda import characterize
>> import pycuda.driver as cuda
>> import pycuda.compiler
>> from pycuda import gpuarray as ga
>>
>> init_rng_src = """
>> #include <curand_kernel.h>
>>
>> extern "C"
>> {
>>
>> __global__ void init_rng(int nthreads, curandState *s,
>>                          unsigned long long seed,
>>                          unsigned long long offset)
>> {
>>     int id = blockIdx.x*blockDim.x + threadIdx.x;
>>
>>     if (id >= nthreads)
>>         return;
>>
>>     curand_init(seed, id, offset, &s[id]);
>> }
>>
>> } // extern "C"
>> """
>>
>> def get_rng_states(size, seed=1):
>>     "Return `size` number of CUDA random number generator states."
>>     rng_states = cuda.mem_alloc(
>>         size*characterize.sizeof('curandStateXORWOW',
>>                                  '#include <curand_kernel.h>'))
>>
>>     module = pycuda.compiler.SourceModule(init_rng_src, no_extern_c=True)
>>     init_rng = module.get_function('init_rng')
>>
>>     init_rng(np.int32(size), rng_states, np.uint64(seed), np.uint64(0),
>>              block=(64,1,1), grid=(size//64+1,1))
>>
>>     return rng_states
>>
>>
>> On Thu, Dec 8, 2011 at 3:37 PM, Thomas Wiecki <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> I want to simulate many noisy brownian motion particles. So for each
>>> particle I have to sum up random numbers repeatedly. I figured I'd
>>> create a function that simulates one particle movement in cuda c and
>>> import it to pycuda via SourceModule. Since I will be simulating many
>>> particles repeatedly I want to initialize the random generators,
>>> curandState, only once in the beginning and pass it to the function
>>> for each run.
>>>
>>> I see that pycuda.curandom.XORWOWRandomNumberGenerator() is
>>> initializing curandState but I'm not sure how I can access it and
>>> pass it as an argument to the cuda function.
>>>
>>> Any ideas on how best to go about doing that?
>>>
>>> Thomas
>>>
>>> _______________________________________________
>>> PyCUDA mailing list
>>> [email protected]
>>> http://lists.tiker.net/listinfo/pycuda
