There isn't provided code that does that. We recently implemented scan checkpoints, which give a trade-off between speed and memory usage:
http://deeplearning.net/software/theano_versions/dev/library/scan.html?highlight=scan_checkpoint#theano.scan_checkpoints

Does that fix what you need, to support very long sequences? The difference is that the updates are done after the forward pass.

On Tue, Dec 6, 2016 at 2:17 AM, Hosang Yoon <[email protected]> wrote:

> In case truncate_gradient > n_steps is not permitted (or does not work as
> intended above), can this be a possible solution?
>
> - Start at time *t*
> - Forward propagate 2*h* steps to time (*t* + 2*h*)
> - Calculate the gradient & parameter update using inputs and states in the
> interval *t* to (*t* + 2*h*)
> - With the updated parameters, do a reverse-direction scan from
> (*t* + 2*h*) to (*t* + *h*) (using the go_backwards flag in scan)
> - Repeat the above until the end of the sequence
>
> So we would be wasting some calculation (going back *h*, then going
> forward 2*h* with the same parameters in the next iteration), but would
> this get the job done?
>
> On Tuesday, December 6, 2016 at 9:50:55 AM UTC+9, Hosang Yoon wrote:
>>
>> Hello,
>>
>> I'm trying to see if I can implement BPTT(2*h*, *h*) as defined in
>> Williams and Peng (1990; doi:10.1162/neco.1990.2.4.490) using theano.scan.
>>
>> It will be used for a long sequence of thousands of steps, where
>> calculating the gradient only once at the end of the sequence (which is
>> the way BPTT is usually implemented in the existing Theano code I can
>> find on the web) would not be very feasible.
>>
>> Instead, BPTT(2*h*, *h*) would involve, starting at time *t*:
>>
>> - Forward propagate *h* steps to time (*t* + *h*)
>> - Calculate the gradient by looking back 2*h* steps, using inputs and
>> states in the interval (*t* - *h*) to (*t* + *h*)
>> - Update parameters, then repeat until the end of the sequence
>>
>> My question is:
>>
>> - Is it possible to pass truncate_gradient = 2 * n_steps to a theano.scan
>> loop, and will it produce the behavior described above?
>> I glanced at the relevant parts of scan_op.py in the Theano source code,
>> but my (not very well educated) impression was that this might not be
>> the case.
>>
>> - If that cannot be done, how would one implement such a mechanism using
>> theano.scan?
>> - Is there public Theano code that implements this that I can refer to?
>>
>> Thank you in advance for any input!
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "theano-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
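For reference, the update schedule described in the original post (update every *h* steps, back-propagating through the last 2*h* steps) can be sketched independently of Theano. This is a minimal illustrative helper, not part of any Theano API; the name `bptt_schedule` and its interface are hypothetical:

```python
def bptt_schedule(n_steps, h):
    """Yield (grad_start, grad_end) windows for BPTT(2h, h).

    A parameter update happens after every h forward steps; each update
    back-propagates through the last 2h steps (or fewer, near the start
    of the sequence, where fewer than 2h steps exist).
    """
    t = h  # first update after h forward steps
    while t <= n_steps:
        yield (max(0, t - 2 * h), t)
        t += h


# For a sequence of 6 steps with h = 2, the gradient windows are
# [0, 2), [0, 4), and [2, 6): each window covers at most 2h = 4 steps
# and ends h steps after the previous one.
print(list(bptt_schedule(6, 2)))
```

Each yielded window would correspond to one gradient computation plus parameter update; a scan-based implementation would run the forward pass over the window and truncate the backward pass at `grad_start`, as the thread discusses.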
