Re: [theano-users] lazy evaluation of a convnet with ifelse

2016-12-06 Thread Pascal Lamblin
It is also possible that the gradient through the non-executed branch
tries to explicitly backpropagate zeros through the convolution.

In that case, an option would be to have an optimization replacing
ConvGradI(zeros(...), w) by zeros(right_shape).
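
Something along these lines could be a starting point. It is very much an
untested sketch: the op class tracked, the input positions, the use of the
shape feature and where to register it would all need to be checked against
the actual code (and the GPU/dnn gradient ops would need the same treatment):

  import numpy
  import theano.tensor as T
  from theano.gof.opt import local_optimizer
  from theano.tensor.opt import register_specialize
  from theano.tensor.nnet.abstract_conv import AbstractConv2d_gradInputs

  def _is_zero_constant(var):
      # True when `var` is a constant tensor filled with zeros.
      return isinstance(var, T.TensorConstant) and numpy.all(var.data == 0)

  @register_specialize
  @local_optimizer([AbstractConv2d_gradInputs])
  def local_conv_grad_inputs_of_zeros(node):
      # Inputs of AbstractConv2d_gradInputs are (kerns, topgrad, shape);
      # bail out unless the incoming gradient is a constant full of zeros.
      if not isinstance(node.op, AbstractConv2d_gradInputs):
          return False
      if not _is_zero_constant(node.inputs[1]):
          return False
      # Build zeros of the right shape from the graph's shape feature,
      # so we never reference the node's own output.
      shape_feature = getattr(node.fgraph, 'shape_feature', None)
      if shape_feature is None:
          return False
      out = node.outputs[0]
      return [T.zeros(shape_feature.shape_of[out], dtype=out.dtype)]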

On Tue, Dec 06, 2016, Frédéric Bastien wrote:
> Hi,
> 
> Using ifelse does not guarantee that the computation in the branch that is
> not taken gets skipped. There is an interaction with another optimization
> that sometimes triggers the execution of some or all of the nodes in the
> non-executed branch. This happens in particular during training.
> 
> I think that if you disable the inplace optimization, it should fix that.
> Can you try that? Just use this Theano flag:
> 
> optimizer_excluding=inplace
> 
> As for the Python execution, this only adds some extra overhead. If you don't
> see it in the Theano profiler output, you don't need to worry about it.
> Otherwise, we would need to make a C interface for the lazy op, as we don't
> have one right now.
> 
> Fred
> 
> On Sun, Dec 4, 2016 at 1:51 AM, Emmanuel Bengio  wrote:
> 
> > Hi everyone,
> >
> > I'm trying to train a deep convnet with residual connections, where some
> > layers are lazily evaluated (and the residual is always forward-propagated).
> >
> > When only doing the forward pass, simply wrapping each layer's output as:
> >   ifelse(cond, conv(x,W) + x, x)
> > works.
> >
> > When doing the backward pass, things get trickier. I need to wrap the
> > update of W in an ifelse as well:
> >   updates += [(W, ifelse(cond, W - lr*T.grad(W), W))]
> >
> > but it seems simply doing this is not enough. If I look at
> > profile.op_callcounts(), I still get too many DnnConvs and ConvGradI/W
> > being executed.
> >
> > If I do the following for the layer output:
> >   ifelse(cond, ifelse(cond, conv(x,W), x) + x, x)
> > then only the right number of DnnConv and ConvGradW are executed.
> > Having taken a peek at ifelse.py, I suspect this is necessary because the
> > optimization that lifts the ifelse is both unaware of convolutions and,
> > most importantly, not activated.
> >
> > Somehow though, all the ConvGradI ops are still being executed.
> > I am basing this on some "minimal" code I made:
> > https://gist.github.com/bengioe/edf82104a391bf54bb8776d8b211e87c
> > With all the ifelse's I'm using, this results in:
> > If eval
> > (, 8)
> > (, 16)
> > (, 7)
> > (, 8)
> > If no eval
> > (, 2)
> > (, 10)
> > (, 7)
> > (, 2)
> >
> > There are 6 lazy layers, so this shows the correct number of DnnConv and
> > ConvGradW being run, but the wrong number of ConvGradI.
> >
> > I'm wondering at this point if there's anything to do without delving deep
> > into the ifelse/lazy evaluation code. I'm really not sure what is going on.
> > Please let me know if you have any ideas or suggestions.
> >
> > Thanks!
> >
> > PS: An aside: since IfElse is still implemented in Python, and I am making
> > lots of calls to it, I'm afraid it might slow down computation. I was
> > wondering if there would be a totally different way of doing what I'm
> > trying to do, maybe with a radically different computational structure?
> > --
> > Emmanuel Bengio
> >

-- 
Pascal



Re: [theano-users] lazy evaluation of a convnet with ifelse

2016-12-05 Thread Frédéric Bastien
Hi,

Using ifelse does not guarantee that the computation in the branch that is
not taken gets skipped. There is an interaction with another optimization
that sometimes triggers the execution of some or all of the nodes in the
non-executed branch. This happens in particular during training.

I think that if you disable the inplace optimization, it should fix that.
Can you try that? Just use this Theano flag:

optimizer_excluding=inplace
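
For example, you can set it from the command line for one run, or exclude the
inplace optimizations for a single function through its compilation Mode (the
script name below is just a placeholder):

  # One-off, from the command line:
  #   THEANO_FLAGS=optimizer_excluding=inplace python your_script.py

  # Or per compiled function, with a Mode that excludes the inplace opts:
  import theano
  import theano.tensor as T

  x = T.vector('x')
  mode = theano.compile.get_default_mode().excluding('inplace')
  f = theano.function([x], 2 * x, mode=mode)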

As for the Python execution, this only adds some extra overhead. If you don't
see it in the Theano profiler output, you don't need to worry about it.
Otherwise, we would need to make a C interface for the lazy op, as we don't
have one right now.

Fred

On Sun, Dec 4, 2016 at 1:51 AM, Emmanuel Bengio  wrote:

> Hi everyone,
>
> I'm trying to train a deep convnet with residual connections, where some
> layers are lazily evaluated (and the residual is always forward-propagated).
>
> When only doing the forward pass, simply wrapping each layer's output as:
>   ifelse(cond, conv(x,W) + x, x)
> works.
>
> When doing the backward pass, things get trickier. I need to wrap the
> update of W in an ifelse as well:
>   updates += [(W, ifelse(cond, W - lr*T.grad(W), W))]
>
> but it seems simply doing this is not enough. If I look at
> profile.op_callcounts(), I still get too many DnnConvs and ConvGradI/W
> being executed.
>
> If I do the following for the layer output:
>   ifelse(cond, ifelse(cond, conv(x,W), x) + x, x)
> then only the right number of DnnConv and ConvGradW are executed.
> Having taken a peek at ifelse.py, I suspect this is necessary because the
> optimization that lifts the ifelse is both unaware of convolutions and,
> most importantly, not activated.
>
> Somehow though, all the ConvGradI ops are still being executed.
> I am basing this on some "minimal" code I made:
> https://gist.github.com/bengioe/edf82104a391bf54bb8776d8b211e87c
> With all the ifelse's I'm using, this results in:
> If eval
> (, 8)
> (, 16)
> (, 7)
> (, 8)
> If no eval
> (, 2)
> (, 10)
> (, 7)
> (, 2)
>
> There are 6 lazy layers, so this shows the correct number of DnnConv and
> ConvGradW being run, but the wrong number of ConvGradI.
>
> I'm wondering at this point if there's anything to do without delving deep
> into the ifelse/lazy evaluation code. I'm really not sure what is going on.
> Please let me know if you have any ideas or suggestions.
>
> Thanks!
>
> PS: An aside: since IfElse is still implemented in Python, and I am making
> lots of calls to it, I'm afraid it might slow down computation. I was
> wondering if there would be a totally different way of doing what I'm trying
> to do, maybe with a radically different computational structure?
> --
> Emmanuel Bengio
>



[theano-users] lazy evaluation of a convnet with ifelse

2016-12-03 Thread Emmanuel Bengio
Hi everyone,

I'm trying to train a deep convnet with residual connections, where some
layers are lazily evaluated (and the residual is always forward-propagated).

When only doing the forward pass, simply wrapping each layer's output as:
  ifelse(cond, conv(x,W) + x, x)
works.

When doing the backward pass, things get trickier. I need to wrap the
update of W in an ifelse as well:
  updates += [(W, ifelse(cond, W - lr*T.grad(W), W))]

but it seems simply doing this is not enough. If I look at
profile.op_callcounts(), I still get too many DnnConvs and ConvGradI/W
being executed.
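
For concreteness, here is a minimal single-layer version of the setup (a
sketch with made-up shapes and names, not the code from the gist linked
below):

  import numpy
  import theano
  import theano.tensor as T
  from theano.ifelse import ifelse
  from theano.tensor.nnet import conv2d

  x = T.tensor4('x', dtype='float32')   # (batch, channels, rows, cols)
  cond = T.iscalar('cond')              # 1: evaluate the layer, 0: skip it
  W = theano.shared(
      numpy.random.randn(16, 16, 3, 3).astype('float32'), name='W')

  # Forward: only compute the convolution when cond is true; the residual
  # x is always propagated.  border_mode='half' keeps the spatial size so
  # the residual addition is valid.
  h = ifelse(cond, conv2d(x, W, border_mode='half') + x, x)

  loss = h.sum()
  lr = numpy.float32(0.01)

  # Backward: also gate the parameter update, so W is left untouched
  # when the layer is skipped.
  updates = [(W, ifelse(cond, W - lr * T.grad(loss, W), W))]

  f = theano.function([x, cond], loss, updates=updates)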

If I do the following for the layer output:
  ifelse(cond, ifelse(cond, conv(x,W), x) + x, x)
then only the right number of DnnConv and ConvGradW are executed.
Having taken a peek at ifelse.py, I suspect this is necessary because the
optimization that lifts the ifelse is both unaware of convolutions and,
most importantly, not activated.

Somehow though, all the ConvGradI ops are still being executed.
I am basing this on some "minimal" code I made:
https://gist.github.com/bengioe/edf82104a391bf54bb8776d8b211e87c
With all the ifelse's I'm using, this results in:
If eval
(, 8)
(, 16)
(, 7)
(, 8)
If no eval
(, 2)
(, 10)
(, 7)
(, 2)

There are 6 lazy layers, so this shows the correct number of DnnConv and
ConvGradW being run, but the wrong number of ConvGradI.
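
(For the record, profiling is enabled roughly like this, reusing the names
from the sketch above; again just a sketch, not the exact gist code:)

  f = theano.function([x, cond], loss, updates=updates, profile=True)
  xv = numpy.random.randn(2, 16, 8, 8).astype('float32')
  f(xv, 1)   # one step with the lazy layer evaluated
  f(xv, 0)   # one step with the lazy layer skipped
  # f.profile is the ProfileStats object the per-Op call counts come from.
  f.profile.summary()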

I'm wondering at this point if there's anything to do without delving deep
into the ifelse/lazy evaluation code. I'm really not sure what is going on.
Please let me know if you have any ideas or suggestions.

Thanks!

PS: An aside: since IfElse is still implemented in Python, and I am making
lots of calls to it, I'm afraid it might slow down computation. I was
wondering if there would be a totally different way of doing what I'm trying
to do, maybe with a radically different computational structure?
-- 
Emmanuel Bengio
