Received from Thomas Unterthiner on Fri, Jun 20, 2014 at 10:13:52AM EDT:
> Hi there!
> 
> I am currently chasing a very weird bug in my code: The following
> code will consistently crash on Kepler-type GPUs (tested on a Tesla
> K40 and on a GTX 780), but runs fine on my Fermi-class notebook GPU:
> 
>     import numpy as np
>     import pycuda.autoinit
>     from pycuda import gpuarray
>     from pycuda.driver import Stream
>     from scikits.cuda.cublas import cublasSgemm
>     import scikits.cuda.autoinit
>     from scikits.cuda.misc import _global_cublas_handle as handle
>     for _ in range(3):
>         n = 131
>         s = slice(128, n)
>         X = gpuarray.to_gpu(np.random.randn(n, 2483).astype(np.float32))
>         a = gpuarray.empty((X.shape[1], 3), dtype=np.float32)
>         c = gpuarray.empty((a.shape[0], X.shape[1]), dtype=np.float32)
>         b = gpuarray.empty_like(X)
>         m, n, k = a.shape[0], b[s].shape[1], a.shape[1]
>         lda, ldb, ldc = m, k, m
     cublasSgemm(handle, 'n', 'n', m, n, k, 1.0, b[s].gpudata,
>                 lda, a.gpudata, ldb, 0.0, c.gpudata, ldc)
>     stream = Stream()
>     stream.synchronize()
> 
> The errors I'm getting are:
> 
>     Traceback (most recent call last):
>       File "<stdin>", line 22, in <module>
     pycuda._driver.LogicError: cuStreamSynchronize failed: invalid/unknown error code
>     >>>
>     PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
>     cuStreamDestroy failed: invalid/unknown error code
>     PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
>     cuMemFree failed: invalid/unknown error code
>     PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
>     cuMemFree failed: invalid/unknown error code
>     PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
>     cuMemFree failed: invalid/unknown error code
>     PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
>     cuMemFree failed: invalid/unknown error code
> 
> The Stream call at the end of the code is only needed to surface the
> error (by triggering an error check); copies to/from the device would
> trigger the same errors. The bug is extremely
> weird, especially since:
> 
> * the constants used seem to matter: if I change 'n' to 132 the
> error goes away, and if I change the 2nd dimension of X from 2483
> to 100, it goes away as well.
> * the order of the allocations matters: if I allocate 'b' before 'c',
> the error goes away
> * the for-loop is necessary (i.e., the error only occurs on the
> third pass through the loop)
> 
> Still, the error seems to be completely reproducible across
> different machines (tried on a machine running CentOS 6 with a K40,
> a machine running Ubuntu 13.10 with a K40, and a machine running
> Xubuntu 14.04 with a GTX 780).

When I tried running the above code against a K20Xm GPU with Ubuntu 14.04, CUDA
5.5, pycuda 2013.1.1 (stock Ubuntu package), and the latest code from
scikits.cuda master, I observed a different exception:

Traceback (most recent call last):
  File "cuda.py", line 12, in <module>
    X = gpuarray.to_gpu(np.random.randn(n, 2483).astype(np.float32))
  File "/usr/lib/python2.7/dist-packages/pycuda/gpuarray.py", line 913, in to_gpu
    result.set(ary)
  File "/usr/lib/python2.7/dist-packages/pycuda/gpuarray.py", line 228, in set
    drv.memcpy_htod(self.gpudata, ary)
pycuda._driver.LaunchError: cuMemcpyHtoD failed: launch failed

The exception occurs on the second iteration of the loop.

No errors observed when run against a Tesla S2050.
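Incidentally, one thing stands out from the reported shapes: with m, n, k =
2483, 2483, 3 and lda = 2483, the A operand (b[s], which has 3 * 2483 = 7449
floats left after the slice offset) is consumed exactly, ending flush against
the end of b's allocation. That would be consistent with (though does not
prove) a Kepler GEMM kernel over-reading slightly past the nominal operand
extent. The column-major extents can be sanity-checked with plain Python, no
GPU required (check_sgemm_extents is a name of my own, for illustration):

```python
def check_sgemm_extents(m, n, k, lda, ldb, ldc, a_elems, b_elems, c_elems):
    """For a column-major SGEMM with opA = opB = 'n', compare the highest
    element index (+1) touched in each operand against the buffer sizes
    (all counts in float32 elements)."""
    need_a = lda * (k - 1) + m   # A is m x k with leading dimension lda
    need_b = ldb * (n - 1) + k   # B is k x n with leading dimension ldb
    need_c = ldc * (n - 1) + m   # C is m x n with leading dimension ldc
    return (need_a <= a_elems, need_b <= b_elems, need_c <= c_elems)

# Values from the report: m, n, k = 2483, 2483, 3; lda, ldb, ldc = 2483, 3, 2483.
# b[s] leaves (131 - 128) * 2483 = 7449 floats after the slice offset;
# a holds 2483 * 3 floats, c holds 2483 * 2483.
print(check_sgemm_extents(2483, 2483, 3, 2483, 3, 2483,
                          7449, 2483 * 3, 2483 * 2483))
# -> (True, True, True)
```

Everything fits, but with zero slack on the A operand, which would explain why
bumping n to 132 (extra rows in b beyond the slice) or shrinking the second
dimension of X (moving the slice away from the end of the allocation) makes
the crash disappear.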
-- 
Lev Givon
Bionet Group | Neurokernel Project
http://www.columbia.edu/~lev/
http://lebedov.github.io/
http://neurokernel.github.io/


_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
