The problem is with grads: the gradients corresponding to 'W1' are coming out as zero!
But why are they zero?
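
One way to confirm this is to look at the raw gradient values directly rather than at whether W1.get_value() changes. A minimal sketch, assuming the x, y and grads variables from the code quoted below, plus NumPy arrays X_train, Y_train holding the 50 training examples (those array names are just placeholders):

import numpy as np
from theano import function

# compile a function that returns the raw gradient arrays for inspection
get_grads = function(inputs=[x, y], outputs=grads)

gW1, gW2 = get_grads(X_train, Y_train)
print("max |dcost/dW1| =", np.abs(gW1).max())
print("max |dcost/dW2| =", np.abs(gW2).max())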

On Wednesday, October 12, 2016 at 4:14:43 PM UTC+5:30, Kv Manohar wrote:
>
> Initial Variables
> x = T.dmatrix('x')
> y = T.dmatrix('y')
>
> These are the weights of a neural network
> W1_vals = np.asarray(rng.randn(input, hidden), dtype=theano.config.floatX)
> W1 = shared(value=W1_vals, name='W1')
> W2_vals = np.asarray(rng.randn(hidden, output), dtype=theano.config.floatX)
> W2 = shared(value=W2_vals, name='W2')
>
> Cost function is:
> hidden_activations = T.nnet.sigmoid(T.dot(x, W1))
> prob_y_given_x = T.nnet.softmax(T.dot(hidden_activations, W2))
>
> # each row of y is a one-hot vector
> cost = T.mean(T.nnet.categorical_crossentropy(prob_y_given_x, y))
> params = [W1, W2]
>
> Corresponding gradients are computed as
> grads = T.grad(cost, params)
>
> Update rule is
> lr = 0.01
> updates = [(param, param - lr * grad) for param, grad in zip(params, grads)]
>
> Function to train the model
> train = function(inputs=[x, y], outputs=cost, updates=updates)
>
> The problem I'm facing
> I'm updating the weights after one full sweep of the training data (50 examples).
> When I print out the values of W1 and W2 after each iteration (using W1.get_value() etc.), W2 gets updated but W1 does not: the values of W1 stay constant throughout.
> Where is the mistake in my code? I'm unable to figure it out.
> Thanks!
