For float16, always use device=cuda, not device=gpu. That could be your problem. Can you test that?
thanks
Fred

On Tue, Oct 4, 2016 at 10:21 AM, <luca.wagner.0...@gmail.com> wrote:
> Hi Fred,
> I tested the convnet using:
>
> floatX = float32
> device = gpu
> theano.tensor.nnet.conv3d2d.conv3d
> updated theano/sandbox/cuda/blas.py downloaded from
> https://github.com/Theano/Theano/pull/5050
>
> The convnet converges:
>
> Python 2.7.12 |Anaconda custom (64-bit)| (default, Jul 2 2016, 17:42:40)
> [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> Anaconda is brought to you by Continuum Analytics.
> Please check out: http://continuum.io/thanks and https://anaconda.org
> >>> runfile('/home/luca/data/DeepLearningTutorials/Theano-3D-ConvNet-master/convnet3d/core/run_multi_conv_t.py', wdir='/home/luca/data/DeepLearningTutorials/Theano-3D-ConvNet-master/convnet3d/core')
> Using gpu device 0: GeForce 840M (CNMeM is disabled, cuDNN 5103)
> /home/luca/data/Theano-master/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
>   "downsample module has been moved to the theano.tensor.signal.pool module.")
>
> start time:
> 04/10/2016
> 16:18:13
>
> Images for training: 316
> Images for validation: 56
>
> training @ iter = 0
> training cost 0.69672
> epoch 1, training batch 316/316, validation error 37.500 %
> ------
>
> If I make the same test using:
>
> floatX = float16
> device = gpu
> theano.tensor.nnet.conv3d2d.conv3d
> updated theano/sandbox/cuda/blas.py downloaded from
> https://github.com/Theano/Theano/pull/5050
>
> I get an error running the convnet:
>
> Python 2.7.12 |Anaconda custom (64-bit)| (default, Jul 2 2016, 17:42:40)
> [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> Anaconda is brought to you by Continuum Analytics.
> Please check out: http://continuum.io/thanks and https://anaconda.org
> >>> runfile('/home/luca/data/DeepLearningTutorials/Theano-3D-ConvNet-master/convnet3d/core/run_multi_conv_t.py', wdir='/home/luca/data/DeepLearningTutorials/Theano-3D-ConvNet-master/convnet3d/core')
> Using gpu device 0: GeForce 840M (CNMeM is disabled, cuDNN 5103)
> /home/luca/data/Theano-master/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
>   "downsample module has been moved to the theano.tensor.signal.pool module.")
> Disabling C code for Elemwise{mul,no_inplace} due to unsupported float16
> Disabling C code for Elemwise{Cast{float32}} due to unsupported float16
> Disabling C code for Elemwise{Cast{float16}} due to unsupported float16
> Disabling C code for Elemwise{Cast{float16}} due to unsupported float16
> Disabling C code for Alloc due to unsupported float16
> Disabling C code for Elemwise{abs_,no_inplace} due to unsupported float16
> Disabling C code for Sum{acc_dtype=float32} due to unsupported float16
> Disabling C code for mrg_uniform{TensorType(float16, matrix),inplace} due to unsupported float16
> Disabling C code for mrg_uniform{TensorType(float16, matrix),inplace} due to unsupported float16
> Disabling C code for Elemwise{Composite{(-Cast{float16}((i0 / i1)))}} due to unsupported float16
> Disabling C code for Elemwise{Composite{Cast{float16}(Cast{int64}(LT(i0, i1)))}}[(0, 0)] due to unsupported float16
> Disabling C code for Elemwise{Composite{Cast{float16}(Cast{int64}(LT(i0, i1)))}}[(0, 0)] due to unsupported float16
> Disabling C code for CorrMM{valid, (1, 1), (1, 1)} due to unsupported float16
> Disabling C code for DiagonalSubtensor{inplace} due to unsupported float16
> Disabling C code for Sum{axis=[3], acc_dtype=float32} due to unsupported float16
> Disabling C code for Elemwise{Add}[(0, 0)] due to unsupported float16
> Disabling C code for sigmoid due to unsupported float16
> Disabling C code for Pool{ds=(3, 3), ignore_border=True, st=(3, 3), padding=(0, 0), mode='max'} due to unsupported float16
> Disabling C code for Pool{ds=(1, 3), ignore_border=True, st=(1, 3), padding=(0, 0), mode='max'} due to unsupported float16
> Disabling C code for dot due to unsupported float16
> Disabling C code for Elemwise{Composite{scalar_sigmoid((i0 + i1))}}[(0, 0)] due to unsupported float16
> Disabling C code for Elemwise{mul,no_inplace} due to unsupported float16
> Disabling C code for dot due to unsupported float16
> Disabling C code for CrossentropySoftmaxArgmax1HotWithBias due to unsupported float16
> Disabling C code for CrossentropySoftmax1HotWithBiasDx due to unsupported float16
> Disabling C code for Sum{acc_dtype=float32} due to unsupported float16
> Disabling C code for Sum{axis=[0], acc_dtype=float32} due to unsupported float16
> Disabling C code for dot due to unsupported float16
> Disabling C code for dot due to unsupported float16
> Disabling C code for Elemwise{Composite{(i0 - (i1 * i2))}}[(0, 0)] due to unsupported float16
> Disabling C code for Elemwise{Composite{(i0 - (i1 * ((i2 * i0) + i3 + (i4 * sgn(i0)))))}}[(0, 3)] due to unsupported float16
> Disabling C code for Elemwise{Composite{Cast{float16}(((i0 - i1) * i1 * i2 * i3 * i4))}}[(0, 1)] due to unsupported float16
> Disabling C code for Elemwise{Sqr}[(0, 0)] due to unsupported float16
> Disabling C code for Sum{axis=[0], acc_dtype=float32} due to unsupported float16
> Disabling C code for dot due to unsupported float16
> Disabling C code for dot due to unsupported float16
> Disabling C code for Sum{acc_dtype=float32} due to unsupported float16
> Disabling C code for Elemwise{Composite{(i0 - (i1 * i2))}}[(0, 0)] due to unsupported float16
> Disabling C code for Elemwise{Composite{(i0 - (i1 * i2))}}[(0, 0)] due to unsupported float16
> Disabling C code for Elemwise{Composite{((-Cast{float16}(((-i0) / i1))) + (i2 * i3) + (i2 * i4))}}[(0, 0)] due to unsupported float16
> Disabling C code for MaxPoolGrad{ds=(1, 3), ignore_border=True, st=(1, 3), padding=(0, 0), mode='max'} due to unsupported float16
> Disabling C code for MaxPoolGrad{ds=(3, 3), ignore_border=True, st=(3, 3), padding=(0, 0), mode='max'} due to unsupported float16
> Disabling C code for Elemwise{Composite{Cast{float16}(((i0 - scalar_sigmoid(i1)) * i2 * scalar_sigmoid(i1)))}}[(0, 1)] due to unsupported float16
> Disabling C code for Sum{axis=[1, 2, 3], acc_dtype=float32} due to unsupported float16
> Disabling C code for Alloc due to unsupported float16
> Disabling C code for Elemwise{Composite{(i0 - (i1 * i2))}}[(0, 0)] due to unsupported float16
> Disabling C code for IncDiagonalSubtensor due to unsupported float16
> Disabling C code for CorrMM_gradWeights{valid, (1, 1), (1, 1)} due to unsupported float16
> Disabling C code for Elemwise{Composite{(i0 - (i1 * i2))}}[(0, 0)] due to unsupported float16
> Disabling C code for mrg_uniform{TensorType(float16, matrix),inplace} due to unsupported float16
> Disabling C code for mrg_uniform{TensorType(float16, matrix),inplace} due to unsupported float16
> Disabling C code for CorrMM{valid, (1, 1), (1, 1)} due to unsupported float16
> Disabling C code for DiagonalSubtensor{inplace} due to unsupported float16
> Disabling C code for Sum{axis=[3], acc_dtype=float32} due to unsupported float16
> Disabling C code for Elemwise{Add}[(0, 0)] due to unsupported float16
> Disabling C code for Elemwise{ScalarSigmoid}[(0, 0)] due to unsupported float16
> Disabling C code for Pool{ds=(3, 3), ignore_border=True, st=(3, 3), padding=(0, 0), mode='max'} due to unsupported float16
> Disabling C code for Pool{ds=(1, 3), ignore_border=True, st=(1, 3), padding=(0, 0), mode='max'} due to unsupported float16
> Disabling C code for dot due to unsupported float16
> Disabling C code for Elemwise{Composite{(Cast{float16}(Cast{int64}(LT(i0, i1))) * scalar_sigmoid((i2 + i3)) * Cast{float16}(Cast{int64}(LT(i4, i1))))}}[(0, 0)] due to unsupported float16
> Disabling C code for dot due to unsupported float16
> Disabling C code for Elemwise{Add}[(0, 0)] due to unsupported float16
> Disabling C code for MaxAndArgmax due to unsupported float16
>
> start time:
> 04/10/2016
> 16:20:07
>
> Images for training: 316
> Images for validation: 56
> Epochs: 100
>
> training @ iter = 0
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/luca/anaconda2/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 714, in runfile
>     execfile(filename, namespace)
>   File "/home/luca/anaconda2/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 81, in execfile
>     builtins.execfile(filename, *where)
>   File "/home/luca/data/DeepLearningTutorials/Theano-3D-ConvNet-master/convnet3d/core/run_multi_conv_t.py", line 32, in <module>
>     run_experiments()
>   File "/home/luca/data/DeepLearningTutorials/Theano-3D-ConvNet-master/convnet3d/core/run_multi_conv_t.py", line 25, in run_experiments
>     Learning_rate=0.001
>   File "mpr_convnet_class_t.py", line 266, in __init__
>     training_cost_ij=train_model(a, b)
>   File "/home/luca/data/Theano-master/theano/compile/function_module.py", line 879, in __call__
>     storage_map=getattr(self.fn, 'storage_map', None))
>   File "/home/luca/data/Theano-master/theano/gof/link.py", line 325, in raise_with_op
>     reraise(exc_type, exc_value, exc_trace)
>   File "/home/luca/data/Theano-master/theano/compile/function_module.py", line 866, in __call__
>     self.fn() if output_subset is None else\
>   File "/home/luca/data/Theano-master/theano/gof/op.py", line 908, in rval
>     r = p(n, [x[0] for x in i], o)
>   File "/home/luca/data/Theano-master/theano/gof/op.py", line 762, in perform
>     "Did you used Theano flags mode=FAST_COMPILE?"
> theano.gof.utils.MethodNotDefined: ('perform', <class 'theano.tensor.nnet.corr.CorrMM'>, 'CorrMM', 'Did you used Theano flags mode=FAST_COMPILE? You can use optimizer=fast_compile instead.')
> Apply node that caused the error: CorrMM{valid, (1, 1), (1, 1)}(InplaceDimShuffle{0,x,1,2}.0, Subtensor{::, ::, ::int64, ::int64}.0)
> Toposort index: 30
> Inputs types: [TensorType(float16, (False, True, False, False)), TensorType(float16, (False, True, False, False))]
> Inputs shapes: [(24, 1, 24, 24), (200, 1, 5, 5)]
> Inputs strides: [(2, 1152, 48, 1152), (50, 50, -10, -2)]
> Inputs values: ['not shown', 'not shown']
> Outputs clients: [[Reshape{5}(CorrMM{valid, (1, 1), (1, 1)}.0, TensorConstant{[24 40 5 20 20]})]]
>
> Backtrace when the node is created (use Theano flag traceback.limit=N to make it longer):
>   File "<stdin>", line 1, in <module>
>   File "/home/luca/anaconda2/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 714, in runfile
>     execfile(filename, namespace)
>   File "/home/luca/anaconda2/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 81, in execfile
>     builtins.execfile(filename, *where)
>   File "/home/luca/data/DeepLearningTutorials/Theano-3D-ConvNet-master/convnet3d/core/run_multi_conv_t.py", line 32, in <module>
>     run_experiments()
>   File "/home/luca/data/DeepLearningTutorials/Theano-3D-ConvNet-master/convnet3d/core/run_multi_conv_t.py", line 25, in run_experiments
>     Learning_rate=0.001
>   File "mpr_convnet_class_t.py", line 169, in __init__
>     b )
>   File "cuddn_convnet3d.py", line 90, in __init__
>     border_mode='valid')
>
> HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
>
> On Monday, October 3, 2016 at 11:02:09 PM UTC+2, nouiz wrote:
>>
>> I have a fix in the new back-end for that error:
>>
>> https://github.com/Theano/Theano/pull/5050
>>
>> So you have a few ways to get it to work: use the config as Pascal wrote (the easiest), or try this new PR.
>>
>> On Mon, Oct 3, 2016 at 12:11 PM, Pascal Lamblin <lamb...@iro.umontreal.ca> wrote:
>>
>>> On Mon, Oct 03, 2016, luca.wag...@gmail.com wrote:
>>> > floatX = float32
>>> > device=cuda0
>>> > dnn.conv.algo_fwd = time_once
>>> > dnn.conv.algo_bwd_filter = time_once
>>> > dnn.conv.algo_bwd_data = time_once
>>>
>>> In the .theanorc, you have to use sections, for instance:
>>>
>>> [dnn.conv]
>>> algo_fwd = time_once
>>> algo_bwd_filter = time_once
>>> algo_bwd_data = time_once
>>>
>>> > Using theano.gpuarray.dnn.dnn_conv the output is: ValueError:
>>> > ("convolution algo %s can't be used for 3d convolutions", ('small',))
>>> > Same output with float16.
>>> >
>>> > If I use theano.sandbox.cuda.dnn.dnn_conv3d with Theano flags
>>> > floatX = float16
>>> > device=cuda0
>>> > dnn.conv.algo_fwd = time_once
>>> > dnn.conv.algo_bwd_filter = time_once
>>> > dnn.conv.algo_bwd_data = time_once
>>> >
>>> > the output is: TypeError: CudaNdarrayType only supports dtype float32 for
>>> > now. Tried using dtype float16 for variable None
>>>
>>> --
>>> Pascal

--
---
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to theano-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
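[Editor's note] The advice in this thread can be collected into one ~/.theanorc. This is only a sketch assembled from the flags discussed above: device/floatX live in the [global] section, the cuDNN convolution options go in their own [dnn.conv] section as Pascal showed, and the time_once values are simply the examples Luca was testing:

```ini
[global]
# device=cuda selects the new gpuarray back-end, which is the only one
# with float16 storage; device=gpu is the old back-end, float32-only.
device = cuda
floatX = float16

[dnn.conv]
# Benchmark each conv algorithm once and reuse the fastest.
algo_fwd = time_once
algo_bwd_filter = time_once
algo_bwd_data = time_once
```

The same settings can be passed per-run via the environment, using dotted names instead of sections, e.g. `THEANO_FLAGS='device=cuda,floatX=float16,dnn.conv.algo_fwd=time_once' python run_multi_conv_t.py`.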