So this confirms what Pascal wrote: it is the fast_math flag that is causing your problem.

We updated the documentation to warn about this in a few places. Where did you find documentation about fastmath that doesn't warn about this?
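If it helps, here is a minimal sketch of the change, based on the .theanorc in your message below (only the [nvcc] section changes; the rest of the file stays as it is):

[nvcc]
fastmath = False
nvcc.flags = -D_FORCE_INLINES

You can also override it for a single run without editing the file, via the THEANO_FLAGS environment variable (the script name here is just the one from your log):

THEANO_FLAGS=nvcc.fastmath=False python Neural_Encryption.py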
On Dec 10, 2016 at 20:45, "Alexander McDowell" <[email protected]> wrote:

> Wait, never mind. The code is producing NaNs again.
>
> In response to your question, using nvcc.fastmath I get this on the gpu
> for 500 iterations of training:
>
> Using gpu device 0: GeForce GTX 770 (CNMeM is disabled, cuDNN not available)
> Building Models
> Training Model!
> Training with device = gpu
> Training on iteration #0
> Receiver Training Error: nan. Interceptor Training Error: 1.006664
> Training on iteration #100
> Receiver Training Error: nan. Interceptor Training Error: nan
> Training on iteration #200
> Receiver Training Error: nan. Interceptor Training Error: nan
> Training on iteration #300
> Receiver Training Error: nan. Interceptor Training Error: nan
> Training on iteration #400
> Receiver Training Error: nan. Interceptor Training Error: nan
> Optimization complete!
> The code for file Neural_Encryption.py ran for 2.39m
>
> And without fast_math:
>
> Using gpu device 0: GeForce GTX 770 (CNMeM is disabled, cuDNN not available)
> Building Models
> Training Model!
> Training with device = gpu
> Training on iteration #0
> Receiver Training Error: 1.001337. Interceptor Training Error: 0.999701
> Training on iteration #100
> Receiver Training Error: 0.992031. Interceptor Training Error: 1.002571
> Training on iteration #200
> Receiver Training Error: 1.004744. Interceptor Training Error: 1.000874
> Training on iteration #300
> Receiver Training Error: 1.007841. Interceptor Training Error: 0.997157
> Training on iteration #400
> Receiver Training Error: 0.984059. Interceptor Training Error: 1.005130
> Optimization complete!
> The code for file Neural_Encryption.py ran for 2.45m
>
> On Saturday, December 10, 2016 at 11:00:10 AM UTC-8, Alexander McDowell wrote:
>>
>> Sorry I haven't responded. I haven't had much time to work on the program.
>>
>> Right now I am using a different computer, and when I run the program on
>> it with fast_math=True it works perfectly fine and doesn't seem to
>> produce any NaNs (I haven't seen the program run all the way through). The
>> cpu also runs a lot slower on this computer, but that is probably because of
>> hardware differences.
>>
>> Is this just a Mac issue with Theano? Or something on this computer?
>>
>> Specs of the computer I am using now:
>>
>> Windows 10 Home
>> Processor: Intel(R) Core(TM) i5-2500 CPU @ 3.30 GHz
>> Installed Memory: 8 GB
>> System Type: 64-bit Operating System, x64-based processor
>> Graphics Card: NVIDIA GeForce GTX 770
>>
>> --
>> Alexander McDowell
>>
>> On Tuesday, December 6, 2016 at 5:16:41 PM UTC-8, Alexander McDowell wrote:
>>>
>>> For some reason, when I try to run this
>>> <https://github.com/nlml/adversarial-neural-crypt/blob/master/adversarial_neural_cryptography.py>
>>> code on the gpu with nvcc.fastmath = True, it runs fine but eventually
>>> starts producing NaNs as a loss. It works fine when I run it on the cpu,
>>> but not on the gpu. If I run it with nvcc.fastmath = False, it runs
>>> perfectly well, but then the cpu version is considerably faster than the
>>> gpu version. Does anyone know why this is?
>>>
>>> GPU result message (with fastmath = True):
>>>
>>> Building Models
>>> Training Model!
>>> Training with device = gpu
>>> Training on iteration #0
>>> Receiver Training Error: nan. Interceptor Training Error: 1.004785
>>> Training on iteration #100
>>> Receiver Training Error: nan. Interceptor Training Error: nan
>>>
>>> ... (keeps going)
>>>
>>> GPU result message (with fastmath = False):
>>>
>>> Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not available)
>>> Building Models
>>> Training Model!
>>> Training with device = gpu
>>> Training on iteration #0
>>> Receiver Training Error: 0.995444. Interceptor Training Error: 1.002399
>>> Training on iteration #100
>>> Receiver Training Error: 0.990433. Interceptor Training Error: 1.002779
>>> Training on iteration #200
>>> Receiver Training Error: 0.991761. Interceptor Training Error: 1.000185
>>>
>>> ... (keeps going)
>>>
>>> CPU result message:
>>>
>>> Building Models
>>> Training Model!
>>> Training with device = cpu
>>> Training on iteration #0
>>> Receiver Training Error: 0.994140. Interceptor Training Error: 1.002878
>>> Training on iteration #100
>>> Receiver Training Error: 1.004477. Interceptor Training Error: 0.997820
>>> Training on iteration #200
>>> Receiver Training Error: 0.998176. Interceptor Training Error: 1.001941
>>>
>>> ... (keeps going)
>>>
>>> I also have my .theanorc file:
>>>
>>> [global]
>>> device = gpu
>>> floatX = float32
>>> cxx = /Library/Developer/CommandLineTools/usr/bin/clang++
>>> optimizer = fast_compile
>>>
>>> [blas]
>>> blas.ldflags =
>>>
>>> [nvcc]
>>> fastmath = True
>>> nvcc.flags = -D_FORCE_INLINES
>>>
>>> [cuda]
>>> root = /usr/local/cuda/
>>>
>>> I also ran the CPU and GPU on the GPU test program from here
>>> <http://deeplearning.net/software/theano/tutorial/using_gpu.html> and
>>> got the following results:
>>>
>>> GPU (with fastmath = True):
>>>
>>> Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not available)
>>> [GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
>>> Looping 1000 times took 0.856593 seconds
>>> Result is [ 1.23178029  1.61879349  1.52278066 ...,  2.20771813  2.29967761  1.62323296]
>>> Used the gpu
>>>
>>> GPU (with fastmath = False):
>>>
>>> Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN not available)
>>> [GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
>>> Looping 1000 times took 0.872737 seconds
>>> Result is [ 1.23178029  1.61879349  1.52278066 ...,  2.20771813  2.29967761  1.62323296]
>>> Used the gpu
>>>
>>> CPU (using .theanorc):
>>>
>>> [Elemwise{exp,no_inplace}(<TensorType(float32, vector)>)]
>>> Looping 1000 times took 2.067907 seconds
>>> Result is [ 1.23178029  1.61879337  1.52278066 ...,  2.20771813  2.29967761  1.62323284]
>>> Used the cpu
>>>
>>> CPU (without .theanorc):
>>>
>>> [Elemwise{exp,no_inplace}(<TensorType(float64, vector)>)]
>>> Looping 1000 times took 16.824746 seconds
>>> Result is [ 1.23178032  1.61879341  1.52278065 ...,  2.20771815  2.29967753  1.62323285]
>>> Used the cpu
>>>
>>> I also have my computer specs, if needed:
>>>
>>> Mac OS Sierra, Version 10.12.1
>>> Processor: 2.9 GHz Intel Core i5
>>> Memory: 8 GB 1600 MHz DDR3
>>> Graphics Card: NVIDIA GeForce GT 650M 512 MB
>>>
>>> Thanks in advance!
>>> - Alexander McDowell
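For anyone who wants to re-run the cpu/gpu comparison from the quoted message, the check in question is the small exp() benchmark from the linked using_gpu tutorial page. Below is a sketch of that kind of script, reconstructed from memory rather than copied from the page (so details may differ): it builds a tiny exp graph, times 1000 calls, and reports whether the elementwise op stayed on the cpu.

from theano import function, config, shared
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # roughly 10 x #cores x #threads per core
iters = 1000

# Shared variable is placed on the device selected by .theanorc / THEANO_FLAGS
rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print(f.maker.fgraph.toposort())

t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))

# If the compiled graph still contains a plain (CPU) Elemwise op,
# the computation did not run on the GPU.
if numpy.any([isinstance(node.op, T.Elemwise) for node in f.maker.fgraph.toposort()]):
    print("Used the cpu")
else:
    print("Used the gpu")

Running it once with device=cpu and once with device=gpu (and with fastmath on and off) should give timings like the ones quoted above.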
