On Thu, Oct 13, 2016, Kv Manohar wrote:
> The problem is with grads, gradients corresponding to 'W1' are turning out 
> to be zero!
> But why are they turning out to be zero?

Again, my first guess would be that the hidden units are saturating.
You can test that by checking the values of hidden_activations.
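
For instance, something like this (a minimal sketch; X_train is a
placeholder name for your input matrix, and x / hidden_activations are
the variables from your code below):

import numpy as np
import theano

# Compile a function that returns the hidden activations for a batch.
get_activations = theano.function([x], hidden_activations)
act = get_activations(X_train)

# A sigmoid unit pinned near 0 or 1 is saturated: its local gradient,
# sigmoid(z) * (1 - sigmoid(z)), is then close to zero, so almost no
# gradient flows back to W1.
print("fraction saturated:", np.mean((act < 0.01) | (act > 0.99)))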

Another thing you can do is inspect what happens during
backpropagation: check the value of the gradient of the cost with
respect to hidden_activations and other intermediate variables.

For instance:
hidden_preactivations = T.dot(x, W1)
hidden_activations = T.nnet.sigmoid(hidden_preactivations)
prob_y_given_x = ...  # like before

and then monitor theano.grad(cost, [hidden_activations, hidden_preactivations]).
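
Concretely, you could compile a monitoring function along these lines
(a sketch only; X_train / Y_train stand in for your data, and the
other variables are the ones defined above and in your message):

import numpy as np
import theano

# Gradients of the cost w.r.t. the two intermediate variables.
g_act, g_preact = theano.grad(
    cost, [hidden_activations, hidden_preactivations])

# Return the gradients alongside the cost so they can be printed
# at each training step.
monitor = theano.function([x, y], [cost, g_act, g_preact])

cost_val, g_act_val, g_preact_val = monitor(X_train, Y_train)
print("max |d cost / d activations|:   ", np.abs(g_act_val).max())
print("max |d cost / d preactivations|:", np.abs(g_preact_val).max())

If the gradient w.r.t. hidden_preactivations is close to zero
everywhere while the gradient w.r.t. hidden_activations is not, that
points to the sigmoid saturating.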



> 
> On Wednesday, October 12, 2016 at 4:14:43 PM UTC+5:30, Kv Manohar wrote:
> >
> > *Initial Variables*
> > x = T.dmatrix('x')
> > y = T.dmatrix('y')
> >
> > *These are the weights of a neural network*
> > W1_vals = np.asarray(rng.randn(input, hidden), dtype=theano.config.floatX)
> > W1 = shared(value=W1_vals, name='W1')
> > W2_vals = np.asarray(rng.randn(hidden, output), dtype=theano.config.floatX)
> > W2 = shared(value=W2_vals, name='W2')
> >
> > *Cost function is:*
> > hidden_activations = T.nnet.sigmoid(T.dot(x, W1))
> > prob_y_given_x = T.nnet.softmax(T.dot(hidden_activations, W2))
> >
> > # each row of y is a one-hot vector
> > cost = T.mean(T.nnet.categorical_crossentropy(prob_y_given_x, y))
> > params = [W1, W2]
> >
> > *The corresponding gradients are computed as*
> > grads = T.grad(cost, params)
> >
> > *The update rule is*
> > lr = 0.01
> > updates = [(param, param - lr * grad) for param, grad in zip(params, grads)]
> >
> > *Function to train the model*
> > train = function(inputs=[x, y], outputs=cost, updates=updates)
> >
> > *The problem I'm facing*
> > I'm updating the weights after one full sweep of the training data
> > (50 examples). When I print out the values of W1 and W2 after each
> > iteration (using W1.get_value() etc.), W2 seems to get updated but
> > not W1: the values of W1 stay constant throughout. Where is the
> > mistake in my code? I'm unable to figure it out.
> > Thanks!
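
If it does turn out to be saturation: with plain randn initialization
and a large input dimension, the pre-activations T.dot(x, W1) can have
a large variance, which pushes the sigmoid into its flat regions. One
common remedy (a sketch, not part of your code) is to scale the
initial weights by the fan-in:

# Scale by 1/sqrt(fan-in) so the pre-activations have roughly unit
# variance (assuming roughly unit-variance inputs).
W1_vals = np.asarray(rng.randn(input, hidden) / np.sqrt(input),
                     dtype=theano.config.floatX)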


-- 
Pascal
