Many thanks, Fred. I updated Theano and tested the same 3D convnet: it converges with float32 but not with float16. The test uses conv3d to classify three different patches extracted from 3D objects in a dataset.
Here are the outputs.

flags: floatX = float32 device=gpu

output:
luca@cuda:~/data/DeepLearningTutorials/Theano-3D-ConvNet-master/convnet3d/core$ python
Python 2.7.11 |Anaconda custom (64-bit)| (default, Dec 6 2015, 18:08:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import run_multi_conv
Mapped name None to device cuda: GeForce 840M
Using cuDNN version 5005 on context None
Using gpu device 0: GeForce 840M (CNMeM is disabled, cuDNN 5005)
/home/luca/data/Theano-master/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
  "downsample module has been moved to the theano.tensor.signal.pool module.")
>>> run_multi_conv.run_experiments()

start time: 22/07/2016 10:35:36

images for training: 178
images for validation: 24
epochs: 200
...
training neural network 13

training @ iter = 0
training cost 1.09
epoch 1, training batch 178/178,validation error 71.67 %
training @ iter = 200
training cost 1.09
epoch 2, training batch 178/178,validation error 71.67 %
training @ iter = 400
training cost 1.09
epoch 3, training batch 178/178,validation error 71.67 %
training @ iter = 600
training cost 1.09
epoch 4, training batch 178/178,validation error 71.67 %
training @ iter = 800
training cost 1.08
epoch 5, training batch 178/178,validation error 71.25 %
training @ iter = 1000
training cost 1.08
epoch 6, training batch 178/178,validation error 70.28 %
training @ iter = 1200
training cost 1.08
epoch 7, training batch 178/178,validation error 68.69 %
training @ iter = 1400
training cost 1.08
epoch 8, training batch 178/178,validation error 65.89 %
training @ iter = 1600
training cost 1.07
epoch 9, training batch 178/178,validation error 63.24 %
training cost 1.07
epoch 10, training batch 178/178,validation error 60.54 %
training @ iter = 1800
training cost 1.06
epoch 11, training batch 178/178,validation error 57.35 %
training @ iter = 2000
training cost 1.06
epoch 12, training batch 178/178,validation error 53.89 %
training @ iter = 2200
training cost 1.05
epoch 13, training batch 178/178,validation error 50.77 %
training @ iter = 2400
training cost 1.04
epoch 14, training batch 178/178,validation error 47.65 %
training @ iter = 2600
training cost 1.04
epoch 15, training batch 178/178,validation error 44.89 %
training @ iter = 2800
training cost 1.03
epoch 16, training batch 178/178,validation error 42.34 %
training @ iter = 3000
training cost 1.01
epoch 17, training batch 178/178,validation error 40.12 %
training @ iter = 3200
training cost 1.00
epoch 18, training batch 178/178,validation error 37.92 %
training cost 0.99
epoch 19, training batch 178/178,validation error 35.96 %
training @ iter = 3400

----------

flags: floatX = float16 device=cuda

output:
Python 2.7.11 |Anaconda custom (64-bit)| (default, Dec 6 2015, 18:08:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> runfile('/home/luca/data/DeepLearningTutorials/Theano-3D-ConvNet-master/convnet3d/core/run_multi_conv.py', wdir='/home/luca/data/DeepLearningTutorials/Theano-3D-ConvNet-master/convnet3d/core')
Mapped name None to device cuda: GeForce 840M
Using cuDNN version 5005 on context None
/home/luca/data/Theano-master/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
  "downsample module has been moved to the theano.tensor.signal.pool module.")
Disabling C code for Elemwise{mul,no_inplace} due to unsupported float16
Disabling C code for Elemwise{Cast{float32}} due to unsupported float16
Disabling C code for Elemwise{Cast{float16}} due to unsupported float16
Disabling C code for Elemwise{Cast{float16}} due to unsupported float16
Disabling C code for Alloc due to unsupported float16
Disabling C code for DiagonalSubtensor{inplace} due to unsupported float16
Disabling C code for IncDiagonalSubtensor due to unsupported float16
Disabling C code for DiagonalSubtensor{inplace} due to unsupported float16
Disabling C code for MaxAndArgmax due to unsupported float16

start time: 22/07/2016 11:04:55

images for training: 178
images for validation: 24
epochs: 200
...
training neural network 13

training @ iter = 0
training cost nan
epoch 1, training batch 178/178,validation error 67.50 %
training @ iter = 200
training cost nan
epoch 2, training batch 178/178,validation error 67.50 %
training @ iter = 400
training cost nan
epoch 3, training batch 178/178,validation error 67.50 %
training @ iter = 600
training cost nan
epoch 4, training batch 178/178,validation error 67.50 %
training @ iter = 800
training cost nan
epoch 5, training batch 178/178,validation error 67.50 %
training @ iter = 1000
training cost nan
epoch 6, training batch 178/178,validation error 67.50 %
training @ iter = 1200
training cost nan
epoch 7, training batch 178/178,validation error 67.50 %
training @ iter = 1400
training cost nan
epoch 8, training batch 178/178,validation error 67.50 %
training @ iter = 1600
training cost nan
epoch 9, training batch 178/178,validation error 67.50 %
training cost nan
epoch 10, training batch 178/178,validation error 67.50 %
training @ iter = 1800
training cost nan
epoch 11, training batch 178/178,validation error 67.50 %
training @ iter = 2000
training cost nan
epoch 12, training batch 178/178,validation error 67.50 %
training @ iter = 2200
training cost nan
epoch 13, training batch 178/178,validation error 67.50 %
training @ iter = 2400
training cost nan
epoch 14, training batch 178/178,validation error 67.50 %
training @ iter = 2600
training cost nan
epoch 15, training batch 178/178,validation error 67.50 %
training @ iter = 2800
training cost nan
epoch 16, training batch 178/178,validation error 67.50 %

-------------
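For reference, here is a minimal NumPy sketch of one way a cost can be nan from the very first iteration in float16 while staying finite in float32. It is not the Theano graph from run_multi_conv, and the logs above do not confirm that this is what happens here. The logits and both helper functions are hypothetical, and the sketch assumes the cost is a softmax cross-entropy over three classes. Under that assumption, a uniform prediction costs -ln(1/3), about 1.0986, which would be consistent with the 1.09 starting cost in the float32 run.

```python
import numpy as np


def naive_softmax_xent(logits, label, dtype):
    """Softmax cross-entropy computed directly in `dtype` (hypothetical helper)."""
    z = np.asarray(logits, dtype=dtype)
    e = np.exp(z)        # float16 tops out at 65504, so exp(12) becomes inf
    p = e / e.sum()      # inf / inf is nan
    return -np.log(p[label])


def stable_softmax_xent(logits, label, dtype):
    """Same cost, but the max logit is subtracted before exponentiating."""
    z = np.asarray(logits, dtype=dtype)
    z = z - z.max()      # largest exponent is now exp(0) = 1
    log_p = z - np.log(np.exp(z).sum())
    return -log_p[label]


# Hypothetical pre-softmax activations, chosen so exp() exceeds the float16 range.
logits = [12.0, 0.0, -1.0]

# Silence NumPy's overflow / invalid-value warnings for the demonstration.
with np.errstate(over="ignore", invalid="ignore"):
    cost16 = naive_softmax_xent(logits, 0, np.float16)
    cost32 = naive_softmax_xent(logits, 0, np.float32)
    stable16 = stable_softmax_xent(logits, 0, np.float16)

# Cost of a uniform guess over three classes (assumed cost function).
uniform_cost = -np.log(1.0 / 3.0)

print("naive float16 cost :", cost16)
print("naive float32 cost :", cost32)
print("stable float16 cost:", stable16)
print("uniform 3-class cost:", uniform_cost)
```

If something like this is the cause, checking intermediate values for inf before the cost is computed would narrow down where the float16 run first goes wrong.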
