I have normalized the input so that all the values are between 0 and 1.
I had no luck with either ReLU or tanh activation units, nor with
multiplying the initial weights by a factor of 0.01.
There seems to be some other problem with my implementation that I'm
unable to figure out.
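
One thing that may be worth ruling out is whether W1 is truly constant or whether its changes are just not visible in the printed output (NumPy abbreviates large arrays when printing, and small updates can disappear in the displayed digits). Below is a minimal sketch in plain Python, without Theano, that measures the largest element-wise change between two snapshots of a weight matrix. The helper name `max_abs_change` and the example values are made up for illustration.

```python
def max_abs_change(before, after):
    """Largest element-wise |after - before| over two equally shaped 2-D lists."""
    return max(
        abs(a - b)
        for row_b, row_a in zip(before, after)
        for b, a in zip(row_b, row_a)
    )


# Hypothetical snapshots of a 2x2 weight matrix before and after one update.
before = [[0.5, -1.2], [0.3, 0.8]]
after = [[0.5, -1.2], [0.3, 0.8000004]]

change = max_abs_change(before, after)
print("max |delta W| = %.3e" % change)
```

With the quoted code, the snapshots could be taken as `W1.get_value().tolist()` before and after calling `train` (by default `get_value()` returns a copy, so the first snapshot is not overwritten). On NumPy arrays directly, `np.abs(after - before).max()` gives the same number.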

On Wednesday, October 12, 2016 at 4:14:43 PM UTC+5:30, Kv Manohar wrote:
>
> Initial variables:
> x = T.dmatrix('x')
> y = T.dmatrix('y')
>
> These are the weights of a neural network:
> W1_vals = np.asarray(rng.randn(input, hidden), dtype=theano.config.floatX)
> W1 = shared(value=W1_vals, name='W1')
> W2_vals = np.asarray(rng.randn(hidden, output), dtype=theano.config.floatX)
> W2 = shared(value=W2_vals, name='W2')
>
> Cost function is:
> hidden_activations = T.nnet.sigmoid(T.dot(x, W1))
> prob_y_given_x = T.nnet.softmax(T.dot(hidden_activations, W2))
>
> # y holds one-hot vectors
> cost = T.mean(T.nnet.categorical_crossentropy(prob_y_given_x, y))
> params = [W1, W2]
>
> Corresponding gradients are computed as:
> grads = T.grad(cost, params)
>
> Update rule is:
> lr = 0.01
> updates = [(param, param - lr * grad) for param, grad in zip(params, grads)]
>
> Function to train the model:
> train = function(inputs=[x, y], outputs=cost, updates=updates)
>
> The problem I'm facing:
> I'm updating the weights after one full sweep of the training data (50
> examples). When I print the values of W1 and W2 after each iteration (using
> W1.get_value() etc.), W2 seems to get updated but W1 does not.
> The values of W1 are constant throughout.
> Where is the mistake in my code?
> I'm unable to figure it out.
> Thanks!
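
One possible explanation to rule in or out for the quoted network: the gradient of the cost with respect to W1 is the transposed input times the back-propagated hidden error, so any input feature that is zero in every training example gets an exactly zero gradient row in W1, and that row never changes. The sketch below re-implements the same sigmoid/softmax/cross-entropy network in plain Python (no Theano) to show this. The layer sizes, the random seed, and the always-zero first feature are assumptions chosen for illustration, not taken from the original data.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain matrix product of two 2-D lists."""
    return [[sum(p * q for p, q in zip(a_row, b_col)) for b_col in zip(*b)]
            for a_row in a]


def forward(x, W1, W2):
    h = [[sigmoid(v) for v in row] for row in matmul(x, W1)]
    p = [softmax(row) for row in matmul(h, W2)]
    return h, p


def cost(x, y, W1, W2):
    """Mean categorical cross-entropy, with y given as one-hot rows."""
    _, p = forward(x, W1, W2)
    total = sum(y_k * math.log(p_k)
                for y_row, p_row in zip(y, p)
                for y_k, p_k in zip(y_row, p_row))
    return -total / len(x)


def grad_W1(x, y, W1, W2):
    """Analytic d(cost)/d(W1), derived by hand for this specific network."""
    h, p = forward(x, W1, W2)
    n = len(x)
    # softmax + cross-entropy with one-hot targets: d cost / d logits = (p - y) / n
    d_logits = [[(p_k - y_k) / n for p_k, y_k in zip(p_row, y_row)]
                for p_row, y_row in zip(p, y)]
    # back through W2: d_h = d_logits . W2^T
    d_h = matmul(d_logits, [list(col) for col in zip(*W2)])
    # back through the sigmoid: multiply by h * (1 - h)
    d_z = [[dh * hv * (1.0 - hv) for dh, hv in zip(dh_row, h_row)]
           for dh_row, h_row in zip(d_h, h)]
    # dW1 = x^T . d_z
    return matmul([list(col) for col in zip(*x)], d_z)


random.seed(0)
n, n_in, n_hidden, n_out = 5, 4, 3, 2  # hypothetical sizes
# Feature 0 is zero in every example; the rest lie in [0, 1).
x = [[0.0] + [random.random() for _ in range(n_in - 1)] for _ in range(n)]
y = [[1.0, 0.0] if i % 2 == 0 else [0.0, 1.0] for i in range(n)]
W1 = [[random.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_in)]
W2 = [[random.gauss(0, 1) for _ in range(n_out)] for _ in range(n_hidden)]

g = grad_W1(x, y, W1, W2)
print("gradient row for the always-zero feature:", g[0])
print("gradient row for a non-zero feature:     ", g[1])
```

If the real inputs have features like this (constant-zero border pixels in image data, for instance), the corresponding rows of W1 would stay fixed while the others move. Whether that applies here depends on the actual data, which the post does not show.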
