Many small differences between two implementations can have a big impact.
You could try starting both models with exactly the same initial parameter
values and presenting the examples in the same order.
Then you can compare the outputs of the untrained models and see whether
there is already a mismatch in the forward pass (fprop) itself.
If the outputs match, try a simpler learning rule (plain SGD instead of
Adam, for example) and compare the updates on the first example.
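To make that concrete, here is a rough sketch of the procedure in plain NumPy.
It is not the Keras or Theano code from the pastebin links: the two "models"
are hypothetical stand-ins (a vectorized and a loop-based linear regressor with
an MSE loss), and all function names are made up for illustration. The point is
only the workflow: share the initial parameters, compare the fprop, then compare
one plain SGD update on the first example.

```python
import numpy as np


def init_params(seed, n_in):
    """Draw one set of initial values; both models get copies of it."""
    rng = np.random.RandomState(seed)
    return {"W": rng.normal(scale=0.1, size=(n_in, 1)), "b": np.zeros(1)}


def copy_params(params):
    return {name: value.copy() for name, value in params.items()}


# Hypothetical implementation A: vectorized.
def fprop_a(params, x):
    return x @ params["W"] + params["b"]


def sgd_step_a(params, x, y, lr):
    err = fprop_a(params, x) - y              # shape (batch, 1)
    grad_w = 2.0 * x.T @ err / len(x)         # gradient of the mean squared error
    grad_b = 2.0 * err.mean(axis=0)
    return {"W": params["W"] - lr * grad_w, "b": params["b"] - lr * grad_b}


# Hypothetical implementation B: same model written with explicit loops.
def fprop_b(params, x):
    out = np.zeros((x.shape[0], 1))
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, 0] += x[i, j] * params["W"][j, 0]
        out[i, 0] += params["b"][0]
    return out


def sgd_step_b(params, x, y, lr):
    new = copy_params(params)
    pred = fprop_b(params, x)
    n = x.shape[0]
    for i in range(n):
        err = pred[i, 0] - y[i, 0]
        for j in range(x.shape[1]):
            new["W"][j, 0] -= lr * 2.0 * err * x[i, j] / n
        new["b"][0] -= lr * 2.0 * err / n
    return new


def max_abs_diff(a, b):
    return float(np.max(np.abs(a - b)))


# Fixed toy data, presented to both models in the same order.
data_rng = np.random.RandomState(42)
x = data_rng.normal(size=(5, 4))
y = data_rng.normal(size=(5, 1))

# Step 1: identical initial parameters for both models.
shared = init_params(seed=0, n_in=4)
params_a, params_b = copy_params(shared), copy_params(shared)

# Step 2: compare the outputs of the untrained models.
fprop_diff = max_abs_diff(fprop_a(params_a, x), fprop_b(params_b, x))
print("fprop max abs diff:", fprop_diff)

# Step 3: compare one plain SGD update on the first example only.
first_x, first_y = x[:1], y[:1]
new_a = sgd_step_a(params_a, first_x, first_y, lr=0.01)
new_b = sgd_step_b(params_b, first_x, first_y, lr=0.01)
update_diffs = {name: max_abs_diff(new_a[name], new_b[name]) for name in new_a}
print("update max abs diff:", update_diffs)
```

With the real models you would dump the Keras weights, load the same arrays
into the Theano shared variables, and run the same three comparisons. The first
step where the difference is clearly larger than floating-point noise is where
to look.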
On Sat, Oct 15, 2016, wondaround wrote:
> I wanted to start using Theano because of the numerous positive reviews I
> read around. To make my life easier, I decided to start with some wrappers,
> specifically Keras. I find Keras a very useful and well done tool. It is
> perfect to start using Theano and it is really easily understandable and
> Now, I created a Keras model (using Theano interface), which works
> perfectly well and I would like to replicate it using only Theano code.
> Since Keras is actually using Theano code, I should be able, in principle,
> to do this.
> The neural net is a convolutional neural network for a single-output
> regression task, with the following layers: conv2d, maxpool2d, conv2d,
> maxpool2d, dense, dense, output, and it uses the Adam optimizer.
> Unfortunately, although it seems to me that I implemented exactly the same
> neural network with vanilla Theano code, the performance is consistently
> worse.
> So, I guess I must be wrong somewhere, but I can't see where.
> I will put links to the code I'm using here, so as not to make the post
> too long. The files are short and simple, do not worry =)
> Keras Impl <http://pastebin.com/7eNubwxw>
> Theano main <http://pastebin.com/Lvdn6UAc>
> Dense layer with MSE loss function <http://pastebin.com/RyUH07Te>
> Conv layer + max pooling <http://pastebin.com/VVmXm1Uk>
> Updates rule <http://pastebin.com/fp5Draq7>
> Keep in mind that the Theano code is mostly adapted from the Theano
> tutorial found on the website, and the update rule for the Adam optimizer is
> adapted from Keras source code.
> I have a large training set, so I usually check after a few epochs the
> behaviour of the code and this is what I see:
> Keras Model: the validation error keeps decreasing, and already after two
> or three epochs I see a very good match between prediction and true values
> (points quite close around the bisector in the plot at the bottom of the
> code)
> Vanilla Theano Impl: the validation error decreases at first, but then some kind
> of oscillating/overfitting features appear (and the absolute value is 10
> times higher than in the Keras impl.), and the match between prediction and
> true values is worse (points quite spread around the bisector in the plot
> at the bottom of the code)
> So, is someone able to tell me where the difference lies between the Keras
> model and my Theano implementation? Since I'm new to the field, I would
> really like to understand what I'm doing wrong that so strongly affects the
> performance of the network.
> Any help would be appreciated. Thanks
> You received this message because you are subscribed to the Google Groups
> "theano-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to theano-users+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.