Hi All,
I am not sure if this is the right place to post this (the NVIDIA forum might
be a better fit?), but since I am running my CUDA kernels from Python
(using PyCUDA), I am going to give this list a try.
My problem is that if I run:
cuda-memcheck python -m pycuda.debug <my_pycuda_code.py>
I end up with a few global memory access violations that look like this:
========= CUDA-MEMCHECK
*** compiler output in /tmp/tmpwmbHa1
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/usr/local/lib/python2.7/dist-packages/pycuda-2012.1-py2.7-linux-x86_64.egg/pycuda/debug.py", line 25, in <module>
execfile(mainpyfile)
File "./layers.py", line 239, in <module>
main()
File "./layers.py", line 233, in main
filt, s1, c1, s2, c2 = extract_all_layers(img, params, index_gpu)
File "./layers.py", line 213, in extract_all_layers
test= True)
File "./layers.py", line 117, in extract_s2_c2
s2_w += [s2_d_temp.get()]
File "/usr/local/lib/python2.7/dist-packages/pycuda-2012.1-py2.7-linux-x86_64.egg/pycuda/gpuarray.py", line 254, in get
drv.memcpy_dtoh(ary, self.gpudata)
pycuda._driver.LaunchError: cuMemcpyDtoH failed: launch failed
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: launch failed
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: launch failed
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: launch failed
========= Invalid __global__ read of size 4
========= at 0x00000970 in /tmp/tmpwmbHa1/kernel.cu:135:extract_s2_matlab_investigation
========= by thread (31,20,0) in block (0,0,0)
========= Address 0xb00350d6c is out of bounds
=========
========= Invalid __global__ read of size 4
========= at 0x00000970 in /tmp/tmpwmbHa1/kernel.cu:135:extract_s2_matlab_investigation
========= by thread (30,20,0) in block (0,0,0)
========= Address 0xb00350d68 is out of bounds
.....
========= Invalid __global__ read of size 4
========= at 0x00000970 in /tmp/tmpwmbHa1/kernel.cu:135:extract_s2_matlab_investigation
========= by thread (0,0,0) in block (1,1,0)
========= Address 0xb00351730 is out of bounds
=========
========= Program hit error 700 on CUDA API call to cuMemcpyDtoH_v2
=========
========= Program hit error 700 on CUDA API call to cuMemFree_v2
=========
========= Program hit error 700 on CUDA API call to cuMemFree_v2
=========
========= Program hit error 700 on CUDA API call to cuModuleUnload
=========
========= ERROR SUMMARY: 1096 errors
Now, if I run my code directly (./<my_python_cuda.py>), everything works fine
(no crash), even if I run it thousands of times. The output I get also makes
sense: I work in computer vision, and when I compare my PyCUDA code against a
SciPy implementation of the same algorithm, both produce the same arrays.
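To give a sense of the kind of check I mean, here is a minimal, self-contained
sketch (not my actual kernel, which is much larger) of launching a PyCUDA
kernel and comparing its output against a CPU reference:

import numpy as np
import pycuda.autoinit  # initializes a CUDA context
import pycuda.driver as cuda
from pycuda.compiler import SourceModule

mod = SourceModule("""
__global__ void square(float *out, const float *in, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)  /* bounds guard: extra threads in the last block do nothing */
        out[i] = in[i] * in[i];
}
""")
square = mod.get_function("square")

n = 1000
a = np.random.rand(n).astype(np.float32)
out = np.empty_like(a)

block = (256, 1, 1)
grid = ((n + block[0] - 1) // block[0], 1)

# cuda.In / cuda.Out copy the host arrays to and from the device around the launch
square(cuda.Out(out), cuda.In(a), np.int32(n), block=block, grid=grid)

# compare against the CPU reference, same idea as my SciPy check
assert np.allclose(out, a ** 2)

In my actual code the device result comes back via gpuarray.get() (as in the
traceback above) and is then compared against the SciPy output.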
Now, I used cuda-gdb on the PyCUDA code and set breakpoints where
cuda-memcheck reports the global memory access violations. I inspected all my
variables and everything made sense (I stepped through the kernel line by line
and didn't "notice" any out-of-bounds access). I also tried "set cuda memcheck
on" before running the code under cuda-gdb, and it did not stop where the
violations supposedly occur according to cuda-memcheck (rough session sketch
below). Any ideas?
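For completeness, the cuda-gdb session went roughly like this (abbreviated;
the breakpoint is on the kernel name from the cuda-memcheck report, and the
exact variables I printed are omitted):

$ cuda-gdb --args python -m pycuda.debug <my_pycuda_code.py>
(cuda-gdb) set cuda memcheck on
(cuda-gdb) break extract_s2_matlab_investigation
(cuda-gdb) run
... breakpoint hit inside the kernel ...
(cuda-gdb) next
(cuda-gdb) print <index variables, addresses, etc.>
(cuda-gdb) continue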
PS: I am new to debugging with cuda-gdb, so please let me know if I am
missing something crucial related to cuda-memcheck.
Thank you so much!!
Youssef
--
Youssef Barhomi, MSc, MEng.
Research Software Engineer at the CLPS department
Brown University
T: +1 (617) 797 9929 | GMT -5:00