Hi all,

Ahmed Fasih <[email protected]> writes:
> Hi folks, I write in the hope that someone has gotten a K20 Kepler 3.5
> compute capability device and has gotten it to do dynamic parallelism,
> wherein a kernel can kick off grids on its own without returning to
> the CPU. A "hello world" example is given at [1], page 23. With the
> suggested nvcc command, I was able to compile and run the hello world
> perfectly, but when I tried to load the kernels into PyCUDA, I get
> LogicErrors, asserting
>
> "cuModuleLoadDataEx failed: invalid image -"
>
> This is before and after I take out the runtime API calls like
> cudaDeviceSynchronize() in the kernel code, and persist after I pass
> the "-rdc=true" and "-lcudadevrt" options into SourceModule. Could
> PyCUDA be hating the runtime API calls in a kernel? If anybody's
> gotten this to work, please share!
>
> On a related note, I've added support for the sm_35 (30?) feature of
> choosing four versus eight byte banks for shared memory from PyCUDA, I
> can send out a rough patch.
>
> [1]
> http://docs.nvidia.com/cuda/pdf/CUDA_Dynamic_Parallelism_Programming_Guide.pdf
>
> For completeness, here's the original CUDA code from there:
>
> #include <stdio.h>
> __global__ void childKernel()
> {
>     printf("Hello ");
> }
> __global__ void parentKernel()
> {
>     // launch child
>     childKernel<<<1,1>>>();
>     if (cudaSuccess != cudaGetLastError()) {
>         return;
>     }
>     // wait for child to complete
>     if (cudaSuccess != cudaDeviceSynchronize()) {
>         return;
>     }
>     printf("World!\n");
> }
> int main(int argc, char *argv[])
> {
>     // launch parent
>     parentKernel<<<1,1>>>();
>     if (cudaSuccess != cudaGetLastError()) {
>         return 1;
>     }
>     // wait for parent to complete
>     if (cudaSuccess != cudaDeviceSynchronize()) {
>         return 2;
>     }
>     return 0;
> }
>
>
> And compiled via:
> $ nvcc -arch=sm_35 -rdc=true hello_world.cu -o hello -lcudadevrt
>
> (copied & pasted from NVIDIA doc [1], all rights reserved by NVIDIA etc.)

Sorry about the thread necromancy. I think I've got a lead on how to make
this happen, here:

https://github.com/inducer/pycuda/issues/45#issuecomment-48953922

Andreas
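
A rough sketch of what loading the quoted hello-world kernels from PyCUDA can look like. It assumes a PyCUDA release that ships pycuda.compiler.DynamicSourceModule, which compiles with relocatable device code and device-links the result against cudadevrt before loading it. The "invalid image" error quoted above is likely what happens when relocatable device code is handed to the module loader without that link step, though this thread does not confirm the cause. The names KERNEL_SOURCE and load_and_run_hello are made up for illustration, and the kernel source assumes a CUDA toolkit that still accepts cudaDeviceSynchronize() in device code.

```python
# Kernel source taken from the quoted NVIDIA example (host main() dropped,
# since PyCUDA launches the parent kernel from Python instead).
KERNEL_SOURCE = r"""
#include <stdio.h>

__global__ void childKernel()
{
    printf("Hello ");
}

__global__ void parentKernel()
{
    // launch child
    childKernel<<<1,1>>>();
    if (cudaSuccess != cudaGetLastError()) {
        return;
    }
    // wait for child to complete (assumes a toolkit that still allows
    // cudaDeviceSynchronize() in device code)
    if (cudaSuccess != cudaDeviceSynchronize()) {
        return;
    }
    printf("World!\n");
}
"""


def load_and_run_hello():
    """Compile, device-link, and launch parentKernel (hypothetical helper)."""
    # Imports are deferred so that merely importing this file does not
    # require a GPU or a PyCUDA installation.
    import pycuda.autoinit  # noqa: F401  (creates a context on the default device)
    import pycuda.driver as cuda
    from pycuda.compiler import DynamicSourceModule

    # Assumption: DynamicSourceModule handles the relocatable-device-code
    # compile and the cudadevrt link that a plain SourceModule does not.
    mod = DynamicSourceModule(KERNEL_SOURCE)
    parent = mod.get_function("parentKernel")

    # One block of one thread, matching parentKernel<<<1,1>>>() in the original.
    parent(block=(1, 1, 1), grid=(1, 1))

    # Wait for the parent (and its child grid) so device printf output is flushed.
    cuda.Context.synchronize()
```

Calling load_and_run_hello() needs a device of compute capability 3.5 or higher, as in the original question.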
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
