On Thu, Oct 13, 2016, Kv Manohar wrote:
> The problem is with grads: the gradients corresponding to 'W1' are turning
> out to be zero! But why are they turning out to be zero?


Again, my first guess would be that the first layer's hidden units are saturating. You can test that by checking the values of hidden_activations. Another thing you can do to see what happens during backpropagation is to check the value of the gradient with respect to hidden_activations and other intermediate variables. For instance:

    hidden_preactivations = T.dot(x, W1)
    hidden_activations = T.nnet.sigmoid(hidden_preactivations)
    prob_y_given_x = ...  # like before

and then monitor theano.grad(cost, [hidden_activations, hidden_preactivations]).

> On Wednesday, October 12, 2016 at 4:14:43 PM UTC+5:30, Kv Manohar wrote:
>
> Initial variables:
>
>     x = T.dmatrix('x')
>     y = T.dmatrix('y')
>
> These are the weights of a neural network:
>
>     W1_vals = np.asarray(rng.randn(input, hidden), dtype=theano.config.floatX)
>     W1 = shared(value=W1_vals, name='W1')
>     W2_vals = np.asarray(rng.randn(hidden, output), dtype=theano.config.floatX)
>     W2 = shared(value=W2_vals, name='W2')
>
> The cost function is:
>
>     hidden_activations = T.nnet.sigmoid(T.dot(x, W1))
>     prob_y_given_x = T.nnet.softmax(T.dot(hidden_activations, W2))
>
>     # y is one-hot vectors
>     cost = T.mean(T.nnet.categorical_crossentropy(prob_y_given_x, y))
>     params = [W1, W2]
>
> The corresponding gradients are computed as:
>
>     grads = T.grad(cost, params)
>
> The update rule is:
>
>     lr = 0.01
>     updates = [(param, param - lr*grad) for param, grad in zip(params, grads)]
>
> Function to train the model:
>
>     train = function(inputs=[x, y], outputs=cost, updates=updates)
>
> The problem I'm facing:
> I'm updating the weights after one full sweep of the training data (50
> examples). When I print out the values of W1 and W2 after each iteration
> (using W1.get_value() etc.), W2 seems to get updated but not W1; the values
> of W1 are constant throughout. Where is the mistake in my code? I'm unable
> to figure it out. Thanks!
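The saturation explanation can be sanity-checked outside of Theano. Below is a plain-numpy sketch of the same forward/backward pass (sizes, the random seed, and the helper names are illustrative assumptions, not from the thread): backprop through a sigmoid multiplies the incoming gradient by h*(1-h), which is essentially zero once a unit's pre-activation is large in magnitude, so a saturated first layer leaves W1's gradient near zero while W2 still trains.

```python
import numpy as np

rng = np.random.RandomState(0)


def sigmoid(z):
    # exp may overflow for very negative z; the limit 1/inf = 0 is still correct.
    with np.errstate(over='ignore'):
        return 1.0 / (1.0 + np.exp(-z))


# Illustrative sizes (not from the original post).
n, n_in, n_hid, n_out = 50, 10, 20, 3
x = rng.randn(n, n_in)
y = np.eye(n_out)[rng.randint(n_out, size=n)]  # one-hot targets


def grad_W1(W1, W2):
    """Gradient of mean cross-entropy wrt W1 for sigmoid hidden + softmax output."""
    h = sigmoid(x.dot(W1))                          # hidden activations
    logits = h.dot(W2)
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)               # softmax probabilities
    d_logits = (p - y) / n                          # d cost / d logits
    d_h = d_logits.dot(W2.T)                        # backprop into hidden layer
    d_pre = d_h * h * (1.0 - h)                     # sigmoid factor h*(1-h)
    return x.T.dot(d_pre)


W2 = rng.randn(n_hid, n_out)
g_healthy = grad_W1(rng.randn(n_in, n_hid), W2)           # reasonable init
g_saturated = grad_W1(1000.0 * rng.randn(n_in, n_hid), W2)  # huge weights -> saturation

print("healthy   |grad W1|:", np.abs(g_healthy).mean())
print("saturated |grad W1|:", np.abs(g_saturated).mean())  # much smaller
```

The same effect is why monitoring theano.grad(cost, [hidden_activations, hidden_preactivations]) is informative: if the gradient wrt hidden_preactivations is already near zero, the hidden layer is saturated before W1 ever sees an update.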
--
Pascal

--
---
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to theano-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.