I'm experiencing bizarre behaviour when enabling cuDNN. I can compile code both with and without cuDNN enabled, but having it disabled yields a noticeably faster compile time. My setup is as follows:
- Windows 10
- Visual Studio Community 2013
- CUDA 7.5
- cuDNN 5.0

I'm testing the code found at http://deeplearning.net/software/theano/tutorial/using_gpu.html and my .theanorc file resembles (without the "<- message" annotations):

    [global]
    floatX = float32
    device = gpu
    allow_gc = False
    #optimizer_including = cudnn   <- This can be enabled and disabled

    [lib]
    cnmem = 0.8

    [dnn]
    #enabled = True   <- This can be enabled and disabled

    [dnn.conv]
    algo_fwd = time_once
    algo_bwd_data = time_once
    algo_bwd_filter = time_once

    [blas]
    ldflags = -LC:\openblas -lopenblas

    [nvcc]
    flags = -LC:\Users\lukas\Anaconda3\libs
    compiler_bindir = "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin"   <- This can be enabled and disabled

With the above options configured I get:

    Using gpu device 0: GeForce 940M (CNMeM is enabled with initial size: 80.0% of memory, cuDNN not available)
    [GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
    Looping 1000 times took 0.908416 seconds
    Result is [ 1.23178029  1.61879349  1.52278066 ...,  2.20771813  2.29967761  1.62323296]
    Used the gpu
    [Finished in 5.6s]

which is the expected outcome and a fast compile time. When I instead set

    [global]
    ...
    optimizer_including = cudnn

    [dnn]
    enabled = True

I get the following error message:

    Using gpu device 0: GeForce 940M (CNMeM is enabled with initial size: 80.0% of memory, cuDNN Can not compile with cuDNN.
    We got this error: b'')
    ERROR (theano.gof.opt): SeqOptimizer apply <theano.sandbox.cuda.dnn.NoCuDNNRaise object at 0x000002137DDDB278>
    ERROR (theano.gof.opt): Traceback:
    ERROR (theano.gof.opt): Traceback (most recent call last):
      File "C:\Users\lukas\Anaconda3\lib\site-packages\theano\gof\opt.py", line 230, in apply
        sub_prof = optimizer.optimize(fgraph)
      File "C:\Users\lukas\Anaconda3\lib\site-packages\theano\gof\opt.py", line 89, in optimize
        ret = self.apply(fgraph, *args, **kwargs)
      File "C:\Users\lukas\Anaconda3\lib\site-packages\theano\sandbox\cuda\dnn.py", line 2564, in apply
        if not dnn_available():
      File "C:\Users\lukas\Anaconda3\lib\site-packages\theano\sandbox\cuda\__init__.py", line 340, in dnn_available
        dnn_available.msg)
    RuntimeError: You enabled cuDNN, but we aren't able to use it: Can not compile with cuDNN.
    We got this error: b''

(the same error and traceback are printed a second time, verbatim)

This is resolved by setting

    [nvcc]
    ...
    # compiler_bindir = "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin"

When running the same code now I get:

    DEBUG: nvcc STDOUT mod.cu
    Creating library C:/Users/lukas/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-SP0-Intel64_Family_6_Model_78_Stepping_3_GenuineIntel-3.5.2-64/tmpoavpotbu/m91973e5c136ea49268a916ff971b7377.lib and object C:/Users/lukas/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-SP0-Intel64_Family_6_Model_78_Stepping_3_GenuineIntel-3.5.2-64/tmpoavpotbu/m91973e5c136ea49268a916ff971b7377.exp
    Using gpu device 0: GeForce 940M (CNMeM is enabled with initial size: 80.0% of memory, cuDNN 5005)
    [GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
    Looping 1000 times took 0.910419 seconds
    Result is [ 1.23178029  1.61879349  1.52278066 ...,  2.20771813  2.29967761  1.62323296]
    Used the gpu
    [Finished in 17.7s]

I figure I can ignore the "DEBUG: nvcc STDOUT mod.cu..." message. So this seems to resolve the problem, but the compile time for this simple code has more than tripled (5.6s to 17.7s). Is there a way to keep compiler_bindir enabled, since that yields the faster compile time and does not produce the "DEBUG: nvcc STDOUT mod.cu" message?

--
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
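P.S. To be clear about which number I mean by "compile time": the "Looping 1000 times took ..." line only measures the loop itself, while the "[Finished in ...]" total includes Theano's C/CUDA compilation before the loop starts. A pure-Python sketch of that timing harness (the stub `f` here just stands in for the compiled Theano function, so this snippet runs without Theano installed):

```python
import time

# Stand-in for the compiled Theano function f = function([], T.exp(x)).
# In the real script this call launches the GPU kernel; here it just
# returns a constant so the timing structure is visible without Theano.
def f():
    return [1.23178029]

iters = 1000  # same loop count as the tutorial script

t0 = time.time()
for _ in range(iters):
    r = f()
t1 = time.time()

print("Looping %d times took %f seconds" % (iters, t1 - t0))
```

The gap between this reported loop time (~0.9 s in both runs) and the editor's total runtime (5.6 s vs 17.7 s) is the compilation overhead I'm asking about.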
