To be clear, by weights I mean the entries of the matrices (Ws) of the 
affine transformation in a node. 

I start with categorical_crossentropy 
<http://deeplearning.net/software/theano/library/tensor/nnet/nnet.html#theano.tensor.nnet.nnet.categorical_crossentropy> 
as my loss function, and I want to add an extra term that penalizes large 
weights. To this end I want to add a term of the form 
"theano.tensor.sum(theano.tensor.exp(-10 * ws))", where "ws" are the weights. 
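For concreteness, here is a pure-NumPy sketch (my own illustration, not the 
symbolic Theano expression itself) of what that term computes on a small 
weight matrix:

```python
import numpy as np

# NumPy sketch of theano.tensor.sum(theano.tensor.exp(-10 * ws)):
# sum exp(-10 * w) over every entry w of the weight matrix.
def penalty(ws, scale=10.0):
    return np.sum(np.exp(-scale * np.asarray(ws)))

W = np.zeros((2, 3))        # dummy 2x3 weight matrix, all zeros
print(penalty(W))           # exp(0) summed over 6 entries -> 6.0
```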

If I follow the source code of categorical_crossentropy 
<http://lasagne.readthedocs.io/en/latest/modules/objectives.html#lasagne.objectives.categorical_crossentropy>, 
i.e. categorical_crossentropy(coding_dist, true_dist):

    if true_dist.ndim == coding_dist.ndim:
        return -tensor.sum(true_dist * tensor.log(coding_dist),
                           axis=coding_dist.ndim - 1)
    elif true_dist.ndim == coding_dist.ndim - 1:
        return crossentropy_categorical_1hot(coding_dist, true_dist)
    else:
        raise TypeError('rank mismatch between coding and true distributions')


it seems I should change the return in the elif branch (third line from the 
bottom) to read

    crossentropy_categorical_1hot(coding_dist, true_dist) + \
        theano.tensor.sum(theano.tensor.exp(-10 * ws))



and change the function's signature to 
categorical_crossentropy(coding_dist, true_dist, ws). Then, when calling 
it, I write 

loss = my_categorical_crossentropy(net_output, true_output, l_layers[1].W) 

where, for a start, l_layers[1].W is the weight matrix coming from the first 
layer of my NN. 
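In other words, the modified loss for the one-hot case should behave like the 
following pure-NumPy sketch (my own illustration of the intent; the names and 
the dummy data are hypothetical, and the real version stays symbolic in Theano):

```python
import numpy as np

def my_categorical_crossentropy(coding_dist, true_dist, ws, scale=10.0):
    # crossentropy_categorical_1hot: -log of the predicted probability
    # assigned to the true class of each example
    xent = -np.log(coding_dist[np.arange(len(true_dist)), true_dist])
    # the penalty is a scalar, broadcast onto every per-example loss
    return xent + np.sum(np.exp(-scale * np.asarray(ws)))

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])     # predicted class probabilities
labels = np.array([0, 1])               # integer (one-hot) targets
W = np.zeros((4, 3))                    # dummy weight matrix
print(my_categorical_crossentropy(probs, labels, W))
# -> approx. [12.3567, 12.2231]: -log(0.7) + 12 and -log(0.8) + 12
```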

With those updates, I go on writing: 

loss = aggregate(loss, mode='mean') 
updates = sgd(loss, all_params, learning_rate=0.005) 
train = theano.function([l_input.input_var, true_output], loss, 
                        updates=updates)
[...]

This compiles and the training of the network runs to completion, but for 
some reason the additional term 
"theano.tensor.sum(theano.tensor.exp(-10 * ws))" seems to have no effect on 
the loss value. 
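For reference, here is a quick NumPy sketch of the term's analytic gradient 
(my own check, not part of the network code), since that is what sgd actually 
feeds into the updates:

```python
import numpy as np

# d/dw of sum(exp(-10 * w)) is -10 * exp(-10 * w) elementwise:
# it is -10 at w = 0 but decays very fast as w grows.
def penalty_grad(w, scale=10.0):
    return -scale * np.exp(-scale * np.asarray(w))

print(penalty_grad(0.0))    # -10.0
print(penalty_grad(1.0))    # about -4.5e-4, already close to zero
```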

So what am I doing wrong? I have been looking through the Theano 
documentation, but so far I could not figure out what might be wrong. 


Any comments are welcome. Thanks!

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.
