Hello,
I am still fighting with my GeForce Titan (GK110, sm_35) on a computer running
Debian 6, and hence limited to CUDA 4.2 (Debian 6 is our production platform,
which is why I am sticking to it).
Plain CUDA code runs fine on it, but anything going through PyCUDA fails systematically.
Here is a minimal PyCUDA test script (bug_pycuda.py):
import pycuda.driver as cuda
import pycuda.autoinit
from pycuda.compiler import SourceModule
sm = SourceModule("""
__global__ void square_array(float *a, int N)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N) a[idx] = a[idx] * a[idx];
}"""
)
It works on the GT200 (selected with CUDA_DEVICE=1) but fails on the GK110 (CUDA_DEVICE=0):
lintaillefer:~ % PYCUDA_MAX_CC="sm_22" CUDA_DEVICE=0 python bug_pycuda.py
Using Arch sm_22
Segmentation fault
lintaillefer:~ % PYCUDA_MAX_CC="sm_21" CUDA_DEVICE=0 python bug_pycuda.py
Using Arch sm_21
Traceback (most recent call last):
  File "bug_pycuda.py", line 9, in <module>
    }"""
  File "/usr/lib/python2.6/dist-packages/pycuda/compiler.py", line 288, in __init__
    self.module = module_from_buffer(cubin)
pycuda._driver.LogicError: cuModuleLoadDataEx failed: invalid source -
I stepped through the whole Python side and it looks fine.
The bug really seems to be in module_from_buffer().
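A minimal sketch of how the compile step and the load step could be run separately to check this. It assumes only what the traceback shows (SourceModule builds a cubin and then passes it to module_from_buffer); the helper names arch_string and compile_then_load are made up for illustration:

```python
def arch_string(compute_capability):
    """Format a (major, minor) compute capability as an nvcc arch name."""
    major, minor = compute_capability
    return "sm_%d%d" % (major, minor)


def compile_then_load(source, arch):
    """Compile with nvcc, then load the cubin, as two visible steps."""
    # Imported lazily so arch_string() stays usable on a machine without a GPU.
    import pycuda.autoinit  # creates a context on the device chosen by CUDA_DEVICE
    import pycuda.driver as cuda
    from pycuda.compiler import compile

    dev = pycuda.autoinit.device
    print("device: %s (%s)" % (dev.name(), arch_string(dev.compute_capability())))
    print("CUDA version PyCUDA was built against: %s" % (cuda.get_version(),))

    cubin = compile(source, arch=arch)        # step 1: nvcc produces the cubin
    print("compiled %d bytes for %s" % (len(cubin), arch))
    return cuda.module_from_buffer(cubin)     # step 2: cuModuleLoadDataEx
```

Calling compile_then_load(kernel_source, "sm_21") with the kernel above should print the compile message before any failure if the problem really is in the load step rather than in nvcc.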
Cheers,
--
Jérôme Kieffer
On-Line Data analysis / Software Group
ISDD / ESRF
tel +33 476 882 445