So this confirms what Pascal wrote: it is the fast_math flag that is causing your problem.

We updated the documentation to warn about this in a few places. Where did you find documentation about fastmath that doesn't warn about this?
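If it helps, here is a minimal sketch of the change, based on the .theanorc in your message below (only the [nvcc] section changes; the rest of the file stays as it is):

[nvcc]
fastmath = False
nvcc.flags = -D_FORCE_INLINES

You can also override it for a single run without editing the file, via the THEANO_FLAGS environment variable (the script name here is just the one from your log):

THEANO_FLAGS=nvcc.fastmath=False python Neural_Encryption.py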
On Dec 10, 2016 at 20:45, "Alexander McDowell" <[email protected]> wrote:

> Wait, never mind. The code is producing NaNs again.
>
> In response to your question, using nvcc.fastmath I get this on the gpu
> for 500 iterations of training:
>
> Using gpu device 0: GeForce GTX 770 (CNMeM is disabled, cuDNN not available)
> Building Models
> Training Model!
> Training with device = gpu
> Training on iteration #0
> Receiver Training Error: nan. Interceptor Training Error: 1.006664
> Training on iteration #100
> Receiver Training Error: nan. Interceptor Training Error: nan
> Training on iteration #200
> Receiver Training Error: nan. Interceptor Training Error: nan
> Training on iteration #300
> Receiver Training Error: nan. Interceptor Training Error: nan
> Training on iteration #400
> Receiver Training Error: nan. Interceptor Training Error: nan
> Optimization complete!
> The code for file Neural_Encryption.py ran for 2.39m
>
> And without fast_math:
>
> Using gpu device 0: GeForce GTX 770 (CNMeM is disabled, cuDNN not available)
> Building Models
> Training Model!
> Training with device = gpu
> Training on iteration #0
> Receiver Training Error: 1.001337. Interceptor Training Error: 0.999701
> Training on iteration #100
> Receiver Training Error: 0.992031. Interceptor Training Error: 1.002571
> Training on iteration #200
> Receiver Training Error: 1.004744. Interceptor Training Error: 1.000874
> Training on iteration #300
> Receiver Training Error: 1.007841. Interceptor Training Error: 0.997157
> Training on iteration #400
> Receiver Training Error: 0.984059. Interceptor Training Error: 1.005130
> Optimization complete!
> The code for file Neural_Encryption.py ran for 2.45m
>
> On Saturday, December 10, 2016 at 11:00:10 AM UTC-8, Alexander McDowell wrote:
>>
>> Sorry I haven't responded. I haven't had much time to work on the program.
>>
>> Right now I am using a different computer, and when I run the program on
>> it with fast_math=True it works perfectly fine and doesn't seem to
>> produce any NaNs (I haven't seen the program run all the way through). The
>> cpu also runs a lot slower on this computer, but that is probably because of
>> hardware differences.
>>
>> Is this just a Mac issue with Theano? Or something on this computer?
>>
>> Specs of the computer I am using now:
>>
>> Windows 10 Home
>> Processor: Intel(R) Core(TM) i5-2500 CPU @ 3.30 GHz
>> Installed Memory: 8 GB
>> System Type: 64-bit Operating System, x64-based processor
>> Graphics Card: NVIDIA GeForce GTX 770
>>
>> --
>> Alexander McDowell
>>
>> On Tuesday, December 6, 2016 at 5:16:41 PM UTC-8, Alexander McDowell wrote:
>>>
>>> For some reason, when I try to run this
>>> <https://github.com/nlml/adversarial-neural-crypt/blob/master/adversarial_neural_cryptography.py>
>>> code on the gpu with nvcc.fastmath = True, it runs fine but eventually
>>> starts producing NaNs as a loss. It works fine when I run it on the cpu,
>>> but not on the gpu. If I run it with nvcc.fastmath = False, it runs
>>> perfectly well, but then the cpu version is considerably faster than the
>>> gpu version. Does anyone know why this is?
>>>
>>> GPU result message (with fastmath = True):
>>>
>>> Building Models
>>> Training Model!
>>> Training with device = gpu
>>> Training on iteration #0
>>> Receiver Training Error: nan. Interceptor Training Error: 1.004785
>>> Training on iteration #100
>>> Receiver Training Error: nan. Interceptor Training Error: nan
>>>
>>> ... (keeps going)
>>>
>>> GPU result message (with fastmath = False):
>>>
>>> Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not available)
>>> Building Models
>>> Training Model!
>>> Training with device = gpu
>>> Training on iteration #0
>>> Receiver Training Error: 0.995444. Interceptor Training Error: 1.002399
>>> Training on iteration #100
>>> Receiver Training Error: 0.990433. Interceptor Training Error: 1.002779
>>> Training on iteration #200
>>> Receiver Training Error: 0.991761. Interceptor Training Error: 1.000185
>>>
>>> ... (keeps going)
>>>
>>> CPU result message:
>>>
>>> Building Models
>>> Training Model!
>>> Training with device = cpu
>>> Training on iteration #0
>>> Receiver Training Error: 0.994140. Interceptor Training Error: 1.002878
>>> Training on iteration #100
>>> Receiver Training Error: 1.004477. Interceptor Training Error: 0.997820
>>> Training on iteration #200
>>> Receiver Training Error: 0.998176. Interceptor Training Error: 1.001941
>>>
>>> ... (keeps going)
>>>
>>> I also have my .theanorc file:
>>>
>>> [global]
>>> device = gpu
>>> floatX = float32
>>> cxx = /Library/Developer/CommandLineTools/usr/bin/clang++
>>> optimizer = fast_compile
>>>
>>> [blas]
>>> blas.ldflags =
>>>
>>> [nvcc]
>>> fastmath = True
>>> nvcc.flags = -D_FORCE_INLINES
>>>
>>> [cuda]
>>> root = /usr/local/cuda/
>>>
>>> I also ran the CPU and GPU on the GPU test program from here
>>> <http://deeplearning.net/software/theano/tutorial/using_gpu.html> and
>>> got the following results:
>>>
>>> GPU (with fastmath = True):
>>>
>>> Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not available)
>>> [GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
>>> Looping 1000 times took 0.856593 seconds
>>> Result is [ 1.23178029  1.61879349  1.52278066 ...,  2.20771813  2.29967761  1.62323296]
>>> Used the gpu
>>>
>>> GPU (with fastmath = False):
>>>
>>> Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not available)
>>> [GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
>>> Looping 1000 times took 0.872737 seconds
>>> Result is [ 1.23178029  1.61879349  1.52278066 ...,  2.20771813  2.29967761  1.62323296]
>>> Used the gpu
>>>
>>> CPU (using .theanorc):
>>>
>>> [Elemwise{exp,no_inplace}(<TensorType(float32, vector)>)]
>>> Looping 1000 times took 2.067907 seconds
>>> Result is [ 1.23178029  1.61879337  1.52278066 ...,  2.20771813  2.29967761  1.62323284]
>>> Used the cpu
>>>
>>> CPU (without .theanorc):
>>>
>>> [Elemwise{exp,no_inplace}(<TensorType(float64, vector)>)]
>>> Looping 1000 times took 16.824746 seconds
>>> Result is [ 1.23178032  1.61879341  1.52278065 ...,  2.20771815  2.29967753  1.62323285]
>>> Used the cpu
>>>
>>> I also have my computer specs, if needed:
>>>
>>> Mac OS Sierra, Version 10.12.1
>>> Processor: 2.9 GHz Intel Core i5
>>> Memory: 8 GB 1600 MHz DDR3
>>> Graphics Card: NVIDIA GeForce GT 650M 512 MB
>>>
>>> Thanks in advance!
>>> - Alexander McDowell
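For anyone who wants to re-run the cpu/gpu comparison from the quoted message, the check in question is the small exp() benchmark from the linked using_gpu tutorial page. Below is a sketch of that kind of script, reconstructed from memory rather than copied from the page (so details may differ): it builds a tiny exp graph, times 1000 calls, and reports whether the elementwise op stayed on the cpu.

from theano import function, config, shared
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # roughly 10 x #cores x #threads per core
iters = 1000

# Shared variable is placed on the device selected by .theanorc / THEANO_FLAGS
rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print(f.maker.fgraph.toposort())

t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))

# If the compiled graph still contains a plain (CPU) Elemwise op,
# the computation did not run on the GPU.
if numpy.any([isinstance(node.op, T.Elemwise) for node in f.maker.fgraph.toposort()]):
    print("Used the cpu")
else:
    print("Used the gpu")

Running it once with device=cpu and once with device=gpu (and with fastmath on and off) should give timings like the ones quoted above.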
