Hello!

CUDA now expects the trans parameter to be an integer: 
http://docs.nvidia.com/cuda/cusolver/index.html#cuds-lt-t-gt-getrs

So I think there is an error in scikit-cuda. It is in fact discussed here:
https://github.com/lebedov/scikit-cuda/issues/191#issuecomment-288728423
But instead of encoding trans, the fix should convert it to an integer, as 
is done in the cublas module: 
https://github.com/lebedov/scikit-cuda/blob/master/skcuda/cublas.py#L111

So the cusolver module should either expect trans to be an integer (and pass 
it directly to the CUDA functions), or expect a string and convert it to an 
integer before calling the CUDA functions (in the latter case we will have 
to update the Theano GPU cusolver ops, too).
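
To illustrate the second option, here is a minimal sketch of such a 
conversion. The names _TRANS_TO_INT and normalize_trans are hypothetical 
and are not part of scikit-cuda; the integer values are those of the 
cublasOperation_t enum (CUBLAS_OP_N = 0, CUBLAS_OP_T = 1, CUBLAS_OP_C = 2).

```python
# Hypothetical helper (not scikit-cuda code): normalize `trans` to the
# integer value of cublasOperation_t before it is passed to a CUDA function.
_TRANS_TO_INT = {
    0: 0, 'n': 0, 'N': 0,  # CUBLAS_OP_N: no transpose
    1: 1, 't': 1, 'T': 1,  # CUBLAS_OP_T: transpose
    2: 2, 'c': 2, 'C': 2,  # CUBLAS_OP_C: conjugate transpose
}


def normalize_trans(trans):
    """Return the integer operation code for an int, str or bytes `trans`."""
    if isinstance(trans, bytes):
        # Accept values that were already encoded, e.g. b'N'.
        trans = trans.decode('ascii')
    try:
        return _TRANS_TO_INT[trans]
    except KeyError:
        raise ValueError("invalid trans value: %r" % (trans,))
```

With something like this, a wrapper could accept both the integer that 
Theano currently passes and a string such as 'N'.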

Meanwhile, I suggest you comment out line 637 of the scikit-cuda cusolver module:

# trans = trans.encode('ascii')

It should then work.

On Wednesday, April 12, 2017 at 14:06:31 UTC-4, Peter St. John wrote:
>
> This is partly related to #5844 
> <https://github.com/Theano/Theano/issues/5844>, but I'm getting errors 
> when using theano's gpu_solve method:
>
> Here's the program I'm trying to get to work:
>
> import numpy as np
>
> import theano
> from theano import tensor
> import theano.tensor.slinalg as slinalg
>
> floatX = theano.config.floatX
>
> from theano.gpuarray.linalg import GpuCusolverSolve
>
> np.random.seed(0)
> A_val = np.random.randn(10, 5, 5).astype(floatX)
> b_val = np.random.randn(10, 5, 1).astype(floatX)
>
> x_np = np.linalg.solve(A_val, b_val)
>
> A = tensor.ftensor3()
> A.tag.test_value = A_val
> b = tensor.ftensor3()
> b.tag.test_value = b_val
>
> x, _ = theano.scan(
>     lambda Ai, bi: slinalg.solve(Ai, bi),
>     sequences=[A, b])
>
> fn = theano.function([A, b], [x])
> x_theano = fn(A_val, b_val)
>
> assert np.allclose(x_theano, x_np)
>
>
> When I run it on the CPU, everything works 'fine', except for some minor 
> complaints:
> $ THEANO_FLAGS=device=cpu python gpu_solve_test.py
> /home/pstjohn/miniconda3/envs/pymc3/lib/python3.6/site-packages/nose_parameterized/__init__.py:7:
>  
> UserWarning: The 'nose-parameterized' package has been renamed 
> 'parameterized'. For the two step migration instructions, see: 
> https://github.com/wolever/parameterized#migrating-from-nose-parameterized-to-parameterized
>  
> (set NOSE_PARAMETERIZED_NO_WARN=1 to suppress this warning)
>   "The 'nose-parameterized' package has been renamed 'parameterized'. "
> Can not use cuDNN on context None: cannot compile with cuDNN. We got this 
> error:
> b'/tmp/try_flags_13i_4vht.c:4:19: fatal error: cudnn.h: No such file or 
> directory\n #include <cudnn.h>\n                   ^\ncompilation 
> terminated.\n'
> Mapped name None to device cuda: Tesla K40c (0000:04:00.0)
>
>
> But on the GPU, the linear solve fails:
> $ THEANO_FLAGS=device=cuda0 python gpu_solve_test.py
> Can not use cuDNN on context None: cannot compile with cuDNN. We got this 
> error:
> b'/tmp/try_flags_o6g40qsx.c:4:19: fatal error: cudnn.h: No such file or 
> directory\n #include <cudnn.h>\n                   ^\ncompilation 
> terminated.\n'
> Mapped name None to device cuda0: Tesla K40c (0000:04:00.0)
> /home/pstjohn/miniconda3/envs/pymc3/lib/python3.6/site-packages/nose_parameterized/__init__.py:7:
>  
> UserWarning: The 'nose-parameterized' package has been renamed 
> 'parameterized'. For the two step migration instructions, see: 
> https://github.com/wolever/parameterized#migrating-from-nose-parameterized-to-parameterized
>  
> (set NOSE_PARAMETERIZED_NO_WARN=1 to suppress this warning)
>   "The 'nose-parameterized' package has been renamed 'parameterized'. "
> Traceback (most recent call last):
>   File "theano/scan_module/scan_perform.pyx", line 397, in 
> theano.scan_module.scan_perform.perform 
> (/home/pstjohn/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.6.1-64/scan_perform/mod.cpp:4490)
>   File "/home/pstjohn/Packages/Theano/theano/gof/op.py", line 888, in rval
>     r = p(n, [x[0] for x in i], o)
>   File "/home/pstjohn/Packages/Theano/theano/gpuarray/linalg.py", line 
> 208, in perform
>     pivots_ptr, b_ptr, ldb, dev_info_ptr)
>   File 
> "/home/pstjohn/miniconda3/envs/pymc3/lib/python3.6/site-packages/skcuda/cusolver.py",
>  
> line 637, in cusolverDnSgetrs
>     trans = trans.encode('ascii')
> AttributeError: 'int' object has no attribute 'encode'
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/home/pstjohn/Packages/Theano/theano/compile/function_module.py", 
> line 884, in __call__
>     self.fn() if output_subset is None else\
>   File "/home/pstjohn/Packages/Theano/theano/scan_module/scan_op.py", line 
> 963, in rval
>     r = p(n, [x[0] for x in i], o)
>   File "/home/pstjohn/Packages/Theano/theano/scan_module/scan_op.py", line 
> 952, in p
>     self, node)
>   File "theano/scan_module/scan_perform.pyx", line 405, in 
> theano.scan_module.scan_perform.perform 
> (/home/pstjohn/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.6.1-64/scan_perform/mod.cpp:4606)
>   File "/home/pstjohn/Packages/Theano/theano/gof/link.py", line 325, in 
> raise_with_op
>     reraise(exc_type, exc_value, exc_trace)
>   File 
> "/home/pstjohn/miniconda3/envs/pymc3/lib/python3.6/site-packages/six.py", 
> line 685, in reraise
>     raise value.with_traceback(tb)
>   File "theano/scan_module/scan_perform.pyx", line 397, in 
> theano.scan_module.scan_perform.perform 
> (/home/pstjohn/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.6.1-64/scan_perform/mod.cpp:4490)
>   File "/home/pstjohn/Packages/Theano/theano/gof/op.py", line 888, in rval
>     r = p(n, [x[0] for x in i], o)
>   File "/home/pstjohn/Packages/Theano/theano/gpuarray/linalg.py", line 
> 208, in perform
>     pivots_ptr, b_ptr, ldb, dev_info_ptr)
>   File 
> "/home/pstjohn/miniconda3/envs/pymc3/lib/python3.6/site-packages/skcuda/cusolver.py",
>  
> line 637, in cusolverDnSgetrs
>     trans = trans.encode('ascii')
> AttributeError: 'int' object has no attribute 'encode'
> Apply node that caused the error: GpuCusolverSolve{A_structure='general', 
> trans='N', inplace=False}(GpuContiguous.0, GpuContiguous.0)
> Toposort index: 2
> Inputs types: [GpuArrayType<None>(float32, matrix), 
> GpuArrayType<None>(float32, matrix)]
> Inputs shapes: [(5, 5), (5, 1)]
> Inputs strides: [(20, 4), (4, 4)]
> Inputs values: ['not shown', gpuarray.array([[ 0.52106488],
>        [-0.57578796],
>        [ 0.14195317],
>        [-0.31932843],
>        [ 0.69153875]], dtype=float32)]
> Outputs clients: [['output']]
>
> HINT: Re-running with most Theano optimization disabled could give you a 
> back-trace of when this node was created. This can be done with by setting 
> the Theano flag 'optimizer=fast_compile'. If that does not work, Theano 
> optimizations can be disabled with 'optimizer=None'.
> HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and 
> storage map footprint of this apply node.
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "gpu_solve_test.py", line 27, in <module>
>     x_theano = fn(A_val, b_val)
>   File "/home/pstjohn/Packages/Theano/theano/compile/function_module.py", 
> line 898, in __call__
>     storage_map=getattr(self.fn, 'storage_map', None))
>   File "/home/pstjohn/Packages/Theano/theano/gof/link.py", line 325, in 
> raise_with_op
>     reraise(exc_type, exc_value, exc_trace)
>   File 
> "/home/pstjohn/miniconda3/envs/pymc3/lib/python3.6/site-packages/six.py", 
> line 685, in reraise
>     raise value.with_traceback(tb)
>   File "/home/pstjohn/Packages/Theano/theano/compile/function_module.py", 
> line 884, in __call__
>     self.fn() if output_subset is None else\
>   File "/home/pstjohn/Packages/Theano/theano/scan_module/scan_op.py", line 
> 963, in rval
>     r = p(n, [x[0] for x in i], o)
>   File "/home/pstjohn/Packages/Theano/theano/scan_module/scan_op.py", line 
> 952, in p
>     self, node)
>   File "theano/scan_module/scan_perform.pyx", line 405, in 
> theano.scan_module.scan_perform.perform 
> (/home/pstjohn/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.6.1-64/scan_perform/mod.cpp:4606)
>   File "/home/pstjohn/Packages/Theano/theano/gof/link.py", line 325, in 
> raise_with_op
>     reraise(exc_type, exc_value, exc_trace)
>   File 
> "/home/pstjohn/miniconda3/envs/pymc3/lib/python3.6/site-packages/six.py", 
> line 685, in reraise
>     raise value.with_traceback(tb)
>   File "theano/scan_module/scan_perform.pyx", line 397, in 
> theano.scan_module.scan_perform.perform 
> (/home/pstjohn/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.6.1-64/scan_perform/mod.cpp:4490)
>   File "/home/pstjohn/Packages/Theano/theano/gof/op.py", line 888, in rval
>     r = p(n, [x[0] for x in i], o)
>   File "/home/pstjohn/Packages/Theano/theano/gpuarray/linalg.py", line 
> 208, in perform
>     pivots_ptr, b_ptr, ldb, dev_info_ptr)
>   File 
> "/home/pstjohn/miniconda3/envs/pymc3/lib/python3.6/site-packages/skcuda/cusolver.py",
>  
> line 637, in cusolverDnSgetrs
>     trans = trans.encode('ascii')
> AttributeError: 'int' object has no attribute 'encode'
> Apply node that caused the error: GpuCusolverSolve{A_structure='general', 
> trans='N', inplace=False}(GpuContiguous.0, GpuContiguous.0)
> Toposort index: 2
> Inputs types: [GpuArrayType<None>(float32, matrix), 
> GpuArrayType<None>(float32, matrix)]
> Inputs shapes: [(5, 5), (5, 1)]
> Inputs strides: [(20, 4), (4, 4)]
> Inputs values: ['not shown', gpuarray.array([[ 0.52106488],
>        [-0.57578796],
>        [ 0.14195317],
>        [-0.31932843],
>        [ 0.69153875]], dtype=float32)]
> Outputs clients: [['output']]
>
> HINT: Re-running with most Theano optimization disabled could give you a 
> back-trace of when this node was created. This can be done with by setting 
> the Theano flag 'optimizer=fast_compile'. If that does not work, Theano 
> optimizations can be disabled with 'optimizer=None'.
> HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and 
> storage map footprint of this apply node.
> Apply node that caused the error: 
> for{gpu,scan_fn}(Elemwise{minimum,no_inplace}.0, 
> GpuSubtensor{int64:int64:int8}.0, GpuSubtensor{int64:int64:int8}.0, 
> Elemwise{minimum,no_inplace}.0)
> Toposort index: 20
> Inputs types: [TensorType(int64, scalar), GpuArrayType<None>(float32, 3D), 
> GpuArrayType<None>(float32, 3D), TensorType(int64, scalar)]
> Inputs shapes: [(), (10, 5, 5), (10, 5, 1), ()]
> Inputs strides: [(), (100, 20, 4), (20, 4, 4), ()]
> Inputs values: [array(10), 'not shown', 'not shown', array(10)]
> Outputs clients: [[HostFromGpu(gpuarray)(for{gpu,scan_fn}.0)]]
>
> HINT: Re-running with most Theano optimization disabled could give you a 
> back-trace of when this node was created. This can be done with by setting 
> the Theano flag 'optimizer=fast_compile'. If that does not work, Theano 
> optimizations can be disabled with 'optimizer=None'.
> HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and 
> storage map footprint of this apply node.
>
>
> Is there a separate issue from the 'trans' error from using scan? 
>
