Hi,

I train for an epoch of 30K samples with batches of 64 and divide the
total time spent by the number of updates. So the cost of the first call
to the compiled Theano function should get smoothed out by the averaging,
am I wrong?
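
For clarity, the measurement is basically this (a minimal sketch; train_fn
here is just a dummy standing in for the compiled theano.function):

import time
import numpy as np

# Dummy stand-in for the compiled Theano training function.
def train_fn(x, y):
    return (x * y).sum()

n_samples, batch_size = 30000, 64
data_x = np.random.rand(n_samples, 10).astype('float32')
data_y = np.random.rand(n_samples, 10).astype('float32')

n_updates = 0
start = time.time()
for i in range(0, n_samples, batch_size):
    train_fn(data_x[i:i + batch_size], data_y[i:i + batch_size])
    n_updates += 1
elapsed = time.time() - start

# ~469 updates per epoch, so one slow first call barely moves the mean.
print('avg time per update: %.6f s' % (elapsed / n_updates))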

I re-did the test yesterday: the timings are roughly equivalent between
the old and new backends, but the new one is definitely not "faster".

The code is at:
https://github.com/lium-lst/nmtpy/blob/master/nmtpy/layers.py

You can search for theano.scan inside. Basically we have two gru_layer's
for the source encoder and one gru_cond_layer in the decoder;
gru_cond_layer is actually two GRUs intertwined through some fairly
complex interactions.
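
To give an idea of what those scan calls look like, here is a heavily
stripped-down single-GRU step (just a sketch; the actual gru_layer in
layers.py is more involved):

import numpy as np
import theano
import theano.tensor as T

dim = 4
rng = np.random.RandomState(0)
floatX = theano.config.floatX

# Fused projections for the reset/update gates and the candidate state.
W = theano.shared(rng.randn(dim, 3 * dim).astype(floatX))  # input-to-hidden
U = theano.shared(rng.randn(dim, 3 * dim).astype(floatX))  # hidden-to-hidden

def step(x_t, h_tm1):
    pre_x = T.dot(x_t, W)
    pre_h = T.dot(h_tm1, U)
    r = T.nnet.sigmoid(pre_x[:, :dim] + pre_h[:, :dim])            # reset
    z = T.nnet.sigmoid(pre_x[:, dim:2*dim] + pre_h[:, dim:2*dim])  # update
    h_tilde = T.tanh(pre_x[:, 2*dim:] + r * pre_h[:, 2*dim:])      # candidate
    return (1. - z) * h_tm1 + z * h_tilde

x = T.tensor3('x')  # (n_steps, n_samples, dim)
h0 = T.zeros((x.shape[1], dim))
h, _ = theano.scan(step, sequences=x, outputs_info=h0)
f = theano.function([x], h[-1])  # final hidden state per sample

gru_cond_layer effectively runs two such steps per decoder timestep, with
the extra interactions in between, so each decoder step is roughly twice
this work.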
