Typically, the parameters of your model (W) won't have a batch axis or indices. For L2 regularization on weight matrices, people generally use alpha * T.sum(W ** 2), where alpha is a scalar weighting of the regularization cost. Penalizing only the sub_tensor would make the penalty depend on which rows happen to appear in the batch, and rows never sampled would never be decayed.
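To make the difference concrete, here is a small sketch in plain NumPy rather than Theano (the Theano expressions are analogous); the matrix values, indices, and alpha are made up for illustration:

```python
import numpy as np

# Toy parameter matrix: 4 rows, 3 columns (values chosen for illustration).
W = np.arange(12, dtype=float).reshape(4, 3)
batch_indices = [0, 2]   # only rows 0 and 2 are used by this batch
alpha = 0.01             # scalar regularization weight

# Penalty over the whole matrix (the usual choice):
full_penalty = alpha * np.sum(W ** 2)

# Penalty over only the rows touched by the batch:
sub_penalty = alpha * np.sum(W[batch_indices, :] ** 2)

print(full_penalty)  # penalizes every entry of W
print(sub_penalty)   # smaller; ignores rows 1 and 3 entirely
```

The sub-tensor penalty changes from batch to batch with batch_indices, which is why the whole-matrix form is the standard one.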
On Sunday, August 28, 2016 at 11:15:12 PM UTC-7, shashank gupta wrote:
> Hi All,
>
> I have a small doubt. If a model uses only part of a parameter in the cost, then when we add the regularization cost (say, the L2 norm) of that parameter to the cost function, should we add the L2 norm of the whole parameter matrix, or the L2 norm of the sub_tensor?
>
> More precisely, which of the following costs is the correct one?
>
> W = T.matrix()
>
> cost = f(W[batch_indices, :]) + T.sum(W ** 2)
>
> OR
>
> cost = f(W[batch_indices, :]) + T.sum(W[batch_indices, :])
>
> Please let me know which of the above cost declarations is correct.
