Re: [theano-users] lazy evaluation of a convnet with ifelse
It is also possible that the gradient through the non-executed branch tries to explicitly backpropagate zeros through the convolution. In that case, an option would be to have an optimization replacing ConvGradI(zeros(...), w) by zeros(right_shape).

On Tue, Dec 06, 2016, Frédéric Bastien wrote:
> Using ifelse does not guarantee that the computation in the untaken
> branch is skipped. It interacts with another optimization that sometimes
> triggers the execution of some or all of the nodes in the branch that
> should not be executed. This happens in particular during training.

--
Pascal

--
---
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to theano-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
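The rewrite suggested above would be safe because the gradient of a convolution with respect to its input is linear in the incoming output gradient: an all-zero output gradient always produces an all-zero input gradient with the input's shape. Below is a minimal pure-Python sketch of that fact, using a 1-D "valid" cross-correlation as a stand-in for the convolution. The names `conv1d_valid` and `conv1d_grad_input` are illustrative only and are not Theano APIs.

```python
def conv1d_valid(x, w):
    """1-D 'valid' cross-correlation: y[i] = sum_k x[i + k] * w[k]."""
    n_out = len(x) - len(w) + 1
    return [sum(x[i + k] * w[k] for k in range(len(w))) for i in range(n_out)]


def conv1d_grad_input(grad_y, w, input_len):
    """Gradient of the loss w.r.t. x, given the gradient w.r.t. y.

    Since y[i] depends on x[i + k] with weight w[k], each grad_y[i]
    contributes grad_y[i] * w[k] to grad_x[i + k].
    """
    grad_x = [0.0] * input_len
    for i, g in enumerate(grad_y):
        for k, wk in enumerate(w):
            grad_x[i + k] += g * wk
    return grad_x


w = [0.5, -1.0, 2.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = conv1d_valid(x, w)

# A skipped branch contributes an all-zero output gradient ...
zero_grad_y = [0.0] * len(y)
grad_x = conv1d_grad_input(zero_grad_y, w, len(x))

# ... so the full computation only reproduces zeros of the input's shape.
assert grad_x == [0.0] * len(x)
```

In other words, when the output gradient is known to be zero, the whole input-gradient computation can be replaced by allocating zeros of the right shape, which is what the proposed optimization would do at the graph level.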
Re: [theano-users] lazy evaluation of a convnet with ifelse
Hi,

Using ifelse does not guarantee that the computation in the untaken branch is skipped. It interacts with another optimization that sometimes triggers the execution of some or all of the nodes in the branch that should not be executed. This happens in particular during training.

I think that if you disable the inplace optimization, it should fix that. Can you try that? Just use this Theano flag:

    optimizer_excluding=inplace

As for IfElse executing in Python, this only adds some extra overhead. If you don't see it in the Theano profiler output, you don't need to worry about it. Otherwise, we need to make a C interface for the lazy op, as we don't have one now.

Fred

On Sun, Dec 4, 2016 at 1:51 AM, Emmanuel Bengio wrote:
> I'm trying to train a deep convnet with residual connections, where some
> layers are lazily evaluated (and the residual is always forward-propagated).
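Theano flags are read from the `THEANO_FLAGS` environment variable as comma-separated `key=value` pairs, so the flag above can be set either on the command line or from Python before Theano is imported. The sketch below shows the second option. The helper name `add_theano_flag` is hypothetical, and the sketch assumes `THEANO_FLAGS` does not already contain a different `optimizer_excluding` entry.

```python
import os


def add_theano_flag(flag, environ=os.environ):
    """Append `flag` (e.g. "optimizer_excluding=inplace") to THEANO_FLAGS.

    Existing flags are kept, and the flag is not added twice.
    """
    current = environ.get("THEANO_FLAGS", "")
    flags = [f for f in current.split(",") if f]
    if flag not in flags:
        flags.append(flag)
    environ["THEANO_FLAGS"] = ",".join(flags)
    return environ["THEANO_FLAGS"]


# Run this before the first `import theano`, since Theano reads the
# variable when it is imported.
add_theano_flag("optimizer_excluding=inplace")
# import theano  # import afterwards so that the flag takes effect
```

Comparing `profile.op_callcounts()` with and without the flag should then show whether the inplace optimization is what forces the extra ops to run.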
[theano-users] lazy evaluation of a convnet with ifelse
Hi everyone,

I'm trying to train a deep convnet with residual connections, where some layers are lazily evaluated (and the residual is always forward-propagated).

When only doing the forward pass, simply wrapping each layer's output as:

    ifelse(cond, conv(x,W) + x, x)

works.

When doing the backward pass, things get trickier. I need to wrap the update of W in an ifelse as well:

    updates += [(W, ifelse(cond, W - lr*T.grad(cost, W), W))]

but it seems simply doing this is not enough. If I look at profile.op_callcounts(), I still get too many DnnConvs and ConvGradI/W being executed.

If I do the following for the layer output:

    ifelse(cond, ifelse(cond, conv(x,W), x) + x, x)

now only the right number of DnnConv and ConvGradW are executed. Having taken a peek at ifelse.py, I suspect that this is necessary because the optimization that lifts the ifelse is both unaware of convolutions and, most importantly, not activated.

Somehow though, all the ConvGradI ops are still being executed. I am basing this on some "minimal" code I made: https://gist.github.com/bengioe/edf82104a391bf54bb8776d8b211e87c

With all the ifelse's I'm using, this results in:

    If eval
    (, 8)
    (, 16)
    (, 7)
    (, 8)
    If no eval
    (, 2)
    (, 10)
    (, 7)
    (, 2)

There are 6 lazy layers, so this shows the correct number of DnnConv and ConvGradW being run, but the wrong number of ConvGradI.

I'm wondering at this point if there's anything to do without delving deep into the ifelse/lazy evaluation code. I'm really not sure what is going on. Please let me know if you have any ideas or suggestions.

Thanks!

ps: Aside thought: Since IfElse is still in Python, and I am doing lots of calls to it, I'm afraid it might slow down computation. I was wondering if there would be a totally different way of doing what I'm trying to do, maybe with a radically different computational structure?

--
Emmanuel Bengio
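The call counts one would hope for in the forward pass can be illustrated without Theano. The following is a toy pure-Python model, not Theano code: it assumes a stack of 8 residual layers of which 6 are gated by the condition (an assumed layout, chosen so that the counts drop by 6 when the condition is false), and it counts how often a stand-in `conv` runs when the skipped branch is truly lazy. All names are hypothetical.

```python
from collections import Counter

calls = Counter()


def conv(x, w):
    """Stand-in for an expensive convolution; here just a scaling."""
    calls["conv"] += 1
    return [w * v for v in x]


def lazy_ifelse(cond, then_fn, else_fn):
    """Evaluate only the selected branch (branches are passed as thunks)."""
    return then_fn() if cond else else_fn()


def residual_layer(x, w, cond):
    # The residual x is always propagated; conv runs only if cond is true.
    return lazy_ifelse(
        cond,
        lambda: [c + v for c, v in zip(conv(x, w), x)],
        lambda: x,
    )


def forward(x, weights, lazy_flags, cond):
    for w, is_lazy in zip(weights, lazy_flags):
        # Non-lazy layers always evaluate their convolution.
        x = residual_layer(x, w, cond if is_lazy else True)
    return x


weights = [0.1] * 8
lazy_flags = [False] + [True] * 6 + [False]  # assumed layout: 6 lazy layers

calls.clear()
forward([1.0, 2.0], weights, lazy_flags, cond=True)
calls_if_eval = calls["conv"]

calls.clear()
forward([1.0, 2.0], weights, lazy_flags, cond=False)
calls_if_no_eval = calls["conv"]

print(calls_if_eval, calls_if_no_eval)  # prints: 8 2
```

In Theano the decision to skip a branch is made by the runtime on the optimized graph rather than by Python control flow, so this model only shows the target behaviour: any op whose count does not drop by the number of lazy layers is being evaluated in the untaken branch.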