Hi all,

You might recall a discussion we had on this list about a race condition between constructors for data in shared memory and the rest of a CUDA kernel:
http://lists.tiker.net/pipermail/pycuda/2012-January/003615.html

Nvidia added a warning to nvcc for CUDA 5 that helps prevent this issue. I also pinged Bryan Catanzaro about this, and he allowed me to forward his reply (thanks!):

-----------------------------------------------------------
Hi Andreas -

It's not actually a fix: it's an extra warning.

"""
nvcc test.cu -arch=sm_20 -DBAD
test.cu(18): warning: __shared__ memory variable with non-empty constructor or destructor (potential race between threads)
(Errors follow)
"""

I think the consensus view is that code which does this is malformed, and should be rewritten. Instead of writing

    __shared__ non_pod x[BLOCKSIZE];

Which invokes the constructor repeatedly and wastefully (every thread calls the constructor for every element of the array) and causes races, you should invoke the constructors yourself. I've attached a simple example using placement new.

So, it's not really a fix. Those of us who put non-plain-old-data in shared arrays need to be more explicit about how constructors are called. If you forget, the code will compile as it did before, but nvcc will warn you about the data race.

Jared Hoberock has some thoughts on how to do this cleanly in C++:
https://github.com/jaredhoberock/personal/tree/master/src/uninitialized
-----------------------------------------------------------

His 'test.cu' is here:
http://tiker.net/tmp/bryan-constructors-in-shared.cu

Just FYI,
Andreas
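
P.S. For anyone who cannot fetch the attachment, here is a minimal sketch of the placement-new pattern described above. It is not Bryan's test.cu. The type non_pod, its contents, the kernel name, the BLOCKSIZE value and the host-side check are all made up for illustration, and the 8-byte alignment is an assumption that happens to be enough for this particular type.

```cuda
#include <cstdio>
#include <new>   // placement new

#define BLOCKSIZE 128

// Hypothetical non-POD type: its non-trivial constructor is the kind of
// thing that triggers the nvcc warning when used in a __shared__ array.
struct non_pod {
    int value;
    __device__ non_pod() : value(42) {}
    __device__ ~non_pod() {}
};

// Assumption: 8-byte alignment is sufficient for non_pod.
static_assert(alignof(non_pod) <= 8, "raise the alignment of the raw storage");

__global__ void construct_in_shared(int *out)
{
    // Raw bytes only: no constructor runs at the point of declaration.
    __shared__ __align__(8) unsigned char storage[BLOCKSIZE * sizeof(non_pod)];
    non_pod *x = reinterpret_cast<non_pod *>(storage);

    // Each thread constructs exactly one element, then the block waits
    // until every element has been constructed.
    new (&x[threadIdx.x]) non_pod();
    __syncthreads();

    // Reading a neighbour's element is safe only after the barrier.
    out[blockIdx.x * BLOCKSIZE + threadIdx.x] =
        x[(threadIdx.x + 1) % BLOCKSIZE].value;

    // Wait for all reads to finish before destroying the elements.
    __syncthreads();
    x[threadIdx.x].~non_pod();
}

int main()
{
    const int blocks = 2;
    int h_out[blocks * BLOCKSIZE];
    int *d_out = nullptr;

    if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        return 1;
    }
    construct_in_shared<<<blocks, BLOCKSIZE>>>(d_out);
    cudaError_t err =
        cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Every thread should have read a fully constructed element.
    int mismatches = 0;
    for (int i = 0; i < blocks * BLOCKSIZE; ++i)
        if (h_out[i] != 42)
            ++mismatches;
    printf("%d mismatches\n", mismatches);
    return mismatches != 0;
}
```

The point of the raw byte array is that the declaration itself no longer constructs anything, so the only constructor calls are the ones you place explicitly, one per thread, with a __syncthreads() between construction and first use.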
