For some reason, when I run this code <https://github.com/nlml/adversarial-neural-crypt/blob/master/adversarial_neural_cryptography.py> on the GPU with nvcc fastmath = True, it initially runs fine but eventually starts producing NaNs as the loss. It works fine when I run it on the CPU, just not on the GPU. If I run it with fastmath = False, the GPU version also works correctly, but then the CPU version is considerably faster than the GPU version. Does anyone know why this is?
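I have not confirmed this is what is happening here, but one common way fastmath produces NaNs is through reduced-precision transcendentals: once a float32 sigmoid saturates to exactly 1.0 or 0.0, a cross-entropy-style loss takes log(0). A minimal NumPy sketch of that failure mode and the usual clipping workaround (this is an illustration, not the actual Theano graph from the script above):

```python
import numpy as np

def sigmoid32(x):
    """Sigmoid computed entirely in float32, as on the GPU with floatX=float32."""
    x = np.float32(x)
    return np.float32(1.0) / (np.float32(1.0) + np.exp(-x))

# For a large input, the float32 sigmoid rounds to exactly 1.0 ...
p = sigmoid32(20.0)                    # == 1.0 exactly in float32
loss_unsafe = -np.log1p(-p)            # log(0) -> inf; NaN once it mixes into gradients

# ... so the usual workaround is to clip predictions away from 0 and 1
# before the log in the cost.
eps = np.float32(1e-7)
p_safe = np.clip(p, eps, np.float32(1.0) - eps)
loss_safe = -np.log1p(-p_safe)         # finite

print(p, loss_unsafe, loss_safe)
```

In Theano terms this would correspond to something like T.clip(output, eps, 1 - eps) inside the cost expression, but that is an assumption about where the NaNs originate, not a diagnosis of this particular script.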
GPU output (fastmath = True):

    Building Models
    Training Model!
    Training with device = gpu
    Training on iteration #0
    Receiver Training Error: nan. Interceptor Training Error: 1.004785
    Training on iteration #100
    Receiver Training Error: nan. Interceptor Training Error: nan
    ... (keeps going)

GPU output (fastmath = False):

    Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not available)
    Building Models
    Training Model!
    Training with device = gpu
    Training on iteration #0
    Receiver Training Error: 0.995444. Interceptor Training Error: 1.002399
    Training on iteration #100
    Receiver Training Error: 0.990433. Interceptor Training Error: 1.002779
    Training on iteration #200
    Receiver Training Error: 0.991761. Interceptor Training Error: 1.000185
    ... (keeps going)

CPU output:

    Building Models
    Training Model!
    Training with device = cpu
    Training on iteration #0
    Receiver Training Error: 0.994140. Interceptor Training Error: 1.002878
    Training on iteration #100
    Receiver Training Error: 1.004477. Interceptor Training Error: 0.997820
    Training on iteration #200
    Receiver Training Error: 0.998176. Interceptor Training Error: 1.001941
    ...
    (keeps going)

My .theanorc file:

    [global]
    device = gpu
    floatX = float32
    cxx = /Library/Developer/CommandLineTools/usr/bin/clang++
    optimizer = fast_compile

    [blas]
    blas.ldflags =

    [nvcc]
    fastmath = True
    nvcc.flags = -D_FORCE_INLINES

    [cuda]
    root = /usr/local/cuda/

I also ran the GPU test program from <http://deeplearning.net/software/theano/tutorial/using_gpu.html> on both the CPU and the GPU, and got the following results:

GPU (fastmath = True):

    Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not available)
    [GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
    Looping 1000 times took 0.856593 seconds
    Result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761 1.62323296]
    Used the gpu

GPU (fastmath = False):

    Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not available)
    [GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
    Looping 1000 times took 0.872737 seconds
    Result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761 1.62323296]
    Used the gpu

CPU (using .theanorc):

    [Elemwise{exp,no_inplace}(<TensorType(float32, vector)>)]
    Looping 1000 times took 2.067907 seconds
    Result is [ 1.23178029 1.61879337 1.52278066 ..., 2.20771813 2.29967761 1.62323284]
    Used the cpu

CPU (without .theanorc):

    [Elemwise{exp,no_inplace}(<TensorType(float64, vector)>)]
    Looping 1000 times took 16.824746 seconds
    Result is [ 1.23178032 1.61879341 1.52278065 ..., 2.20771815 2.29967753 1.62323285]
    Used the cpu

My computer specs, if needed:

    Mac OS Sierra, version 10.12.1
    Processor: 2.9 GHz Intel Core i5
    Memory: 8 GB 1600 MHz DDR3
    Graphics card: NVIDIA GeForce GT 650M 512 MB

Thanks in advance!

- Alexander McDowell

--
---
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
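Whatever the root cause turns out to be, it can help to fail fast instead of training on NaNs for thousands of iterations. Theano has a NanGuardMode (in theano.compile.nanguardmode) that raises as soon as a NaN appears in any node of the graph, at some cost in speed. A lighter-weight, framework-agnostic sketch that just watches the reported losses (the loss values and names here are hypothetical stand-ins, not taken from the script):

```python
import numpy as np

def check_finite(name, value):
    """Stop at the first non-finite loss so the offending iteration
    can be inspected, instead of silently continuing on NaNs."""
    if not np.all(np.isfinite(value)):
        raise FloatingPointError("non-finite value in %s: %r" % (name, value))
    return value

# Hypothetical stand-in for the training loop's reported errors:
reported_losses = [0.995444, 0.990433, float("nan")]
first_bad = None
for i, loss in enumerate(reported_losses):
    try:
        check_finite("receiver training error, iteration %d" % i, loss)
    except FloatingPointError as exc:
        first_bad = i
        print("stopping early:", exc)
        break
```

Pinpointing the first iteration that goes non-finite (here it would be the third entry) also makes it easier to dump the inputs and parameters that triggered the NaN for comparison between the fastmath and non-fastmath runs.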
For more options, visit https://groups.google.com/d/optout.
