Sorry I haven't responded; I haven't had much time to work on the program.

Right now I am using a different computer, and when I run the program on it 
with fastmath = True, it works perfectly fine and doesn't seem to produce 
any NaNs (though I haven't seen it run all the way through yet). The CPU 
version also runs a lot slower on this computer, but that is probably 
because of hardware differences.

Is this just a Mac issue with Theano, or something specific to the Mac's setup?
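One thing I have been wondering: fastmath trades accuracy for speed (e.g. flushing tiny float32 values to zero and using approximate transcendentals), so if the loss ever takes a log of, or divides by, a near-zero value, the GPU path can hit an exact 0 where the CPU path stays finite. A generic sketch of the usual epsilon-clamp workaround (hypothetical helper, not from the linked code):

```python
import math

# Hypothetical helper, not from the linked code: fast-math GPU kernels may
# flush tiny float32 values to zero, so log(p) can see p == 0.0 exactly.
# Clamping with a small epsilon keeps the loss finite.
def safe_log(p, eps=1e-7):
    return math.log(max(p, eps))

print(safe_log(0.0))   # log(eps) ~ -16.12 instead of a math domain error
print(safe_log(0.5))   # unchanged for normal inputs: ~ -0.6931
```

If clamping like this makes the NaNs disappear under fastmath, that would point at a precision issue rather than a driver or platform bug.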

Specs of the computer I am using now:

Windows 10 Home
Processor: Intel(R) Core(TM) i5-2500 CPU @ 3.30 GHz
Installed Memory: 8 GB
System Type: 64-bit Operating System, x64-based processor
Graphics Card: NVIDIA GeForce GTX 770
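
In the meantime, one way to narrow down where the NaNs first appear on the Mac would be Theano's NanGuardMode, which raises an error at the first op that yields a NaN or Inf. Assuming a Theano version that registers the mode by name (the flag combination here is a guess based on my .theanorc, not something I have tested):

```shell
# Run under NanGuardMode so the first op producing a NaN/Inf raises
# immediately, identifying the culprit op (much slower; debugging only).
THEANO_FLAGS='device=gpu,floatX=float32,mode=NanGuardMode' python adversarial_neural_cryptography.py
```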

--
Alexander McDowell

On Tuesday, December 6, 2016 at 5:16:41 PM UTC-8, Alexander McDowell wrote:
>
> For some reason, when I run this 
> <https://github.com/nlml/adversarial-neural-crypt/blob/master/adversarial_neural_cryptography.py>
>  
> code on the GPU with nvcc.fastmath = True, it starts out fine but 
> eventually begins producing NaNs as the loss. It works fine when I run it 
> on the CPU. With nvcc.fastmath = False, the GPU also runs perfectly well, 
> but then the CPU version is considerably slower than the GPU version is 
> fast — that is, the CPU version is considerably faster than the GPU 
> version. Does anyone know why this is?
>
> GPU result message (with fastmath = True):
>
> Building Models
>
> Training Model!
>
> Training with device = gpu
>
> Training on iteration #0
>
> Receiver Training Error: nan. Interceptor Training Error: 1.004785
>
> Training on iteration #100
>
> Receiver Training Error: nan. Interceptor Training Error: nan
>
>
> ... (keeps going)
>
>
> GPU result message (with fastmath = False):
>
>
> Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not 
> available)
>
> Building Models
>
> Training Model!
>
> Training with device = gpu
>
> Training on iteration #0
>
> Receiver Training Error: 0.995444. Interceptor Training Error: 1.002399
>
> Training on iteration #100
>
> Receiver Training Error: 0.990433. Interceptor Training Error: 1.002779
>
> Training on iteration #200
>
> Receiver Training Error: 0.991761. Interceptor Training Error: 1.000185
>
>
> ... (keeps going)
>
>
> CPU result message:
>
>
> Building Models
>
> Training Model!
>
> Training with device = cpu
>
> Training on iteration #0
>
> Receiver Training Error: 0.994140. Interceptor Training Error: 1.002878
>
> Training on iteration #100
>
> Receiver Training Error: 1.004477. Interceptor Training Error: 0.997820
>
> Training on iteration #200
>
> Receiver Training Error: 0.998176. Interceptor Training Error: 1.001941
>
>
> ... (keeps going)
>
>
> I also have my .theanorc file:
>
>
> [global]
>
> device = gpu
>
> floatX = float32
>
> cxx = /Library/Developer/CommandLineTools/usr/bin/clang++
>
> optimizer=fast_compile
>
>
> [blas]
>
> blas.ldflags=
>
>
> [nvcc]
>
> fastmath = True
>
> nvcc.flags = -D_FORCE_INLINES
>
>
> [cuda]
>
> root = /usr/local/cuda/
>
>
>
> I also ran the GPU test program from here 
> <http://deeplearning.net/software/theano/tutorial/using_gpu.html> on both 
> the CPU and GPU and got the following results:
>
> GPU (with fastmath = True):
>
> Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not 
> available)
>
> [GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), 
> HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
>
> Looping 1000 times took 0.856593 seconds
>
> Result is [ 1.23178029  1.61879349  1.52278066 ...,  2.20771813  2.29967761
>
>   1.62323296]
>
> Used the gpu
>
>
> GPU (with fastmath = False):
>
> Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not 
> available)
>
> [GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), 
> HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
>
> Looping 1000 times took 0.872737 seconds
>
> Result is [ 1.23178029  1.61879349  1.52278066 ...,  2.20771813  2.29967761
>
>   1.62323296]
>
> Used the gpu
>
>
> CPU (using .theanorc):
>
> [Elemwise{exp,no_inplace}(<TensorType(float32, vector)>)]
>
> Looping 1000 times took 2.067907 seconds
>
> Result is [ 1.23178029  1.61879337  1.52278066 ...,  2.20771813  2.29967761
>
>   1.62323284]
>
> Used the cpu
>
> CPU (without .theanorc):
>
> [Elemwise{exp,no_inplace}(<TensorType(float64, vector)>)]
>
> Looping 1000 times took 16.824746 seconds
>
> Result is [ 1.23178032  1.61879341  1.52278065 ...,  2.20771815  2.29967753
>
>   1.62323285]
>
> Used the cpu
>
>
>
> I have also included my computer specs, in case they are needed:
>
> macOS Sierra, version 10.12.1
> Processor: 2.9 GHz Intel Core i5
>
> Memory: 8 GB 1600 MHz DDR3
>
> Graphics Card: NVIDIA GeForce GT 650M 512 MB
>
> Thanks in advance!
> - Alexander McDowell
>

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.