Hello pycuda-users:
I’m trying to compile the simplest possible code that uses dynamic
parallelism with the regular SourceModule. My code:
------------------------------------------------------------------------
import numpy as np
import pycuda.driver as cuda
from pycuda.compiler import SourceModule
import pycuda.autoinit
cudaCodeString = """
__global__ void ChildKernel(void *data){
    //Operate on data
}
__global__ void ParentKernel(void *data){
    if (threadIdx.x == 0) {
        ChildKernel<<<1, 32>>>(data);
        cudaThreadSynchronize();
    }
    __syncthreads();
    //Operate on data
}
"""
cudaCode = SourceModule(cudaCodeString, options=['-rdc=true', '-lcudart'],
                        arch='compute_35')
-------------------------------------------------------------------------------
I get the following error:
---------------------------------------------------------------------------------
pycuda.driver.CompileError: nvcc compilation of /tmp/tmpJJo9kU/kernel.cu failed
[command: nvcc --cubin -rdc=true -lcudart -arch compute_35 -I/usr/local/lib/python2.7/dist-packages/pycuda-2014.1-py2.7-linux-x86_64.egg/pycuda/cuda kernel.cu]
[stderr:
nvcc fatal : Option '--cubin (-cubin)' is not allowed when compiling for a virtual compute architecture
-----------------------------------------------------------------------------------
CUDA version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2013 NVIDIA Corporation
Built on Wed_Jul_17_18:36:13_PDT_2013
Cuda compilation tools, release 5.5, V5.5.0
Driver version: 331.38
--------------------------------------------------------------------------------------
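My reading of the nvcc message, which may well be incomplete: compute_35
names a virtual architecture, while --cubin needs a real one such as sm_35.
Below is a minimal sketch of the mapping I would try next. Note that
is_virtual_arch and real_arch_for are hypothetical helpers of my own, not
part of PyCUDA:

```python
import re


def is_virtual_arch(arch):
    """Return True for virtual architecture names such as 'compute_35'."""
    return re.match(r"compute_\d+[a-z]?$", arch) is not None


def real_arch_for(arch):
    """Map 'compute_XX' to 'sm_XX'; return any other string unchanged.

    Hypothetical helper (not a PyCUDA function): it only rewrites the
    name and does not check that the GPU actually supports it.
    """
    if is_virtual_arch(arch):
        return "sm_" + arch[len("compute_"):]
    return arch


# The value I would then pass as arch= to SourceModule instead of 'compute_35'.
arch = real_arch_for("compute_35")
print(arch)
```

I have not confirmed that changing arch alone is enough for dynamic
parallelism, which as far as I know also needs the device runtime library
(cudadevrt) at link time.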
Any ideas?
Is anyone successfully using dynamic parallelism with pycuda?
Thanks in advance.
Bruno