Scan performs the accumulation of the partial gradients at each timestep, so it does not keep track of the individual partial gradients (though you may have access to a cumulative sum by digging through its output buffers).
The simplest way would be to define a dummy sequence z full of zeros, of shape (n_steps, *W.shape). Then, each time you use W in the step function, use (W + z_t) instead. That way, if you take the gradient of your cost with respect to z, it will be equal to the gradient of the cost with respect to W at each timestep.

On Fri, Oct 07, 2016, John Moore wrote:
> Somewhat equivalently, how could I take each of the gradient updates
> instead of scan just summing all gradient updates automatically?
>
> On Friday, October 7, 2016 at 4:16:46 PM UTC-4, John Moore wrote:
> >
> > Hi All,
> >
> > My understanding of BPTT is to unfold the network, take the gradients
> > through time, then average the weight updates.
> > How do I obtain the weight updates at each timestep? I know that scan
> > automatically performs BPTT for you, so that it gives you only one
> > weight update.
> >
> > Any insight appreciated.
> >
> > Thanks,
> > John
>
> --
> ---
> You received this message because you are subscribed to the Google Groups "theano-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> For more options, visit https://groups.google.com/d/optout.

--
Pascal
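
The zero-sequence trick above can be illustrated outside Theano with a tiny scalar recurrence in plain NumPy (the names `forward`, `w`, `z` here are illustrative, not Theano API): each step uses (w + z[t]) in place of w, the per-timestep gradients dC/dz_t are recovered individually, and their sum matches the single accumulated gradient dC/dw that scan's BPTT would return.

```python
import numpy as np

# Sketch of the trick in plain NumPy (not Theano): a scalar recurrence
#   h_t = (w + z[t]) * h_{t-1} + x[t],   cost C = h_T.
# z is a dummy sequence of zeros; dC/dz_t is the per-timestep
# contribution of w, and summing over t recovers dC/dw.

def forward(w, z, x, h0=0.0):
    h = h0
    for t in range(len(x)):
        h = (w + z[t]) * h + x[t]
    return h  # cost = final state

x = np.array([1.0, 2.0, 3.0])
w = 0.5
z = np.zeros_like(x)  # dummy zero sequence, one entry per timestep
eps = 1e-6

# Per-timestep gradients dC/dz_t, via finite differences
per_step = np.array([
    (forward(w, z + eps * np.eye(len(x))[t], x) - forward(w, z, x)) / eps
    for t in range(len(x))
])

# Total gradient dC/dw -- what scan's summed BPTT gives you directly
total = (forward(w + eps, z, x) - forward(w, z, x)) / eps

print(per_step)        # gradient of the cost wrt w at each timestep
print(per_step.sum())  # matches total, up to finite-difference error
print(total)
```

In Theano one would compute `T.grad(cost, z)` instead of the finite differences, but the equivalence shown here is the same: the rows of dC/dz are the per-timestep gradients, and their sum is the accumulated gradient wrt W.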
