For some reason, when I run this code <https://github.com/nlml/adversarial-neural-crypt/blob/master/adversarial_neural_cryptography.py> on the GPU with nvcc fastmath = True, it initially runs fine but eventually starts producing NaNs as the loss. It works fine when I run it on the CPU, just not on the GPU. If I run it with fastmath = False, the GPU version also works correctly, but then the CPU version is considerably faster than the GPU version. Does anyone know why this is?
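I have not confirmed this is what is happening here, but one common way fastmath produces NaNs is through reduced-precision transcendentals: once a float32 sigmoid saturates to exactly 1.0 or 0.0, a cross-entropy-style loss takes log(0). A minimal NumPy sketch of that failure mode and the usual clipping workaround (this is an illustration, not the actual Theano graph from the script above):

```python
import numpy as np

def sigmoid32(x):
    """Sigmoid computed entirely in float32, as on the GPU with floatX=float32."""
    x = np.float32(x)
    return np.float32(1.0) / (np.float32(1.0) + np.exp(-x))

# For a large input, the float32 sigmoid rounds to exactly 1.0 ...
p = sigmoid32(20.0)                    # == 1.0 exactly in float32
loss_unsafe = -np.log1p(-p)            # log(0) -> inf; NaN once it mixes into gradients

# ... so the usual workaround is to clip predictions away from 0 and 1
# before the log in the cost.
eps = np.float32(1e-7)
p_safe = np.clip(p, eps, np.float32(1.0) - eps)
loss_safe = -np.log1p(-p_safe)         # finite

print(p, loss_unsafe, loss_safe)
```

In Theano terms this would correspond to something like T.clip(output, eps, 1 - eps) inside the cost expression, but that is an assumption about where the NaNs originate, not a diagnosis of this particular script.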
GPU output (fastmath = True):

    Building Models
    Training Model!
    Training with device = gpu
    Training on iteration #0
    Receiver Training Error: nan. Interceptor Training Error: 1.004785
    Training on iteration #100
    Receiver Training Error: nan. Interceptor Training Error: nan
    ... (keeps going)

GPU output (fastmath = False):

    Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not available)
    Building Models
    Training Model!
    Training with device = gpu
    Training on iteration #0
    Receiver Training Error: 0.995444. Interceptor Training Error: 1.002399
    Training on iteration #100
    Receiver Training Error: 0.990433. Interceptor Training Error: 1.002779
    Training on iteration #200
    Receiver Training Error: 0.991761. Interceptor Training Error: 1.000185
    ... (keeps going)

CPU output:

    Building Models
    Training Model!
    Training with device = cpu
    Training on iteration #0
    Receiver Training Error: 0.994140. Interceptor Training Error: 1.002878
    Training on iteration #100
    Receiver Training Error: 1.004477. Interceptor Training Error: 0.997820
    Training on iteration #200
    Receiver Training Error: 0.998176. Interceptor Training Error: 1.001941
    ...
    (keeps going)

My .theanorc file:

    [global]
    device = gpu
    floatX = float32
    cxx = /Library/Developer/CommandLineTools/usr/bin/clang++
    optimizer = fast_compile

    [blas]
    blas.ldflags =

    [nvcc]
    fastmath = True
    nvcc.flags = -D_FORCE_INLINES

    [cuda]
    root = /usr/local/cuda/

I also ran the GPU test program from <http://deeplearning.net/software/theano/tutorial/using_gpu.html> on both the CPU and the GPU, and got the following results:

GPU (fastmath = True):

    Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not available)
    [GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
    Looping 1000 times took 0.856593 seconds
    Result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761 1.62323296]
    Used the gpu

GPU (fastmath = False):

    Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not available)
    [GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
    Looping 1000 times took 0.872737 seconds
    Result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761 1.62323296]
    Used the gpu

CPU (using .theanorc):

    [Elemwise{exp,no_inplace}(<TensorType(float32, vector)>)]
    Looping 1000 times took 2.067907 seconds
    Result is [ 1.23178029 1.61879337 1.52278066 ..., 2.20771813 2.29967761 1.62323284]
    Used the cpu

CPU (without .theanorc):

    [Elemwise{exp,no_inplace}(<TensorType(float64, vector)>)]
    Looping 1000 times took 16.824746 seconds
    Result is [ 1.23178032 1.61879341 1.52278065 ..., 2.20771815 2.29967753 1.62323285]
    Used the cpu

My computer specs, if needed:

    Mac OS Sierra, version 10.12.1
    Processor: 2.9 GHz Intel Core i5
    Memory: 8 GB 1600 MHz DDR3
    Graphics card: NVIDIA GeForce GT 650M 512 MB

Thanks in advance!

- Alexander McDowell

--
---
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
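Whatever the root cause turns out to be, it can help to fail fast instead of training on NaNs for thousands of iterations. Theano has a NanGuardMode (in theano.compile.nanguardmode) that raises as soon as a NaN appears in any node of the graph, at some cost in speed. A lighter-weight, framework-agnostic sketch that just watches the reported losses (the loss values and names here are hypothetical stand-ins, not taken from the script):

```python
import numpy as np

def check_finite(name, value):
    """Stop at the first non-finite loss so the offending iteration
    can be inspected, instead of silently continuing on NaNs."""
    if not np.all(np.isfinite(value)):
        raise FloatingPointError("non-finite value in %s: %r" % (name, value))
    return value

# Hypothetical stand-in for the training loop's reported errors:
reported_losses = [0.995444, 0.990433, float("nan")]
first_bad = None
for i, loss in enumerate(reported_losses):
    try:
        check_finite("receiver training error, iteration %d" % i, loss)
    except FloatingPointError as exc:
        first_bad = i
        print("stopping early:", exc)
        break
```

Pinpointing the first iteration that goes non-finite (here it would be the third entry) also makes it easier to dump the inputs and parameters that triggered the NaN for comparison between the fastmath and non-fastmath runs.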
For more options, visit https://groups.google.com/d/optout.
