Typically, the parameters of your model (W) won't have a batch axis or indices. For L2 regularization on weight matrices, people generally use alpha * T.sum(W ** 2), where alpha is a scalar weighting of the regularization cost. Penalizing only the sub_tensor would make the penalty depend on which rows happen to appear in the batch, and rows never sampled would never be decayed.
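To make the difference concrete, here is a small sketch in plain NumPy rather than Theano (the Theano expressions are analogous); the matrix values, indices, and alpha are made up for illustration:

```python
import numpy as np

# Toy parameter matrix: 4 rows, 3 columns (values chosen for illustration).
W = np.arange(12, dtype=float).reshape(4, 3)
batch_indices = [0, 2]   # only rows 0 and 2 are used by this batch
alpha = 0.01             # scalar regularization weight

# Penalty over the whole matrix (the usual choice):
full_penalty = alpha * np.sum(W ** 2)

# Penalty over only the rows touched by the batch:
sub_penalty = alpha * np.sum(W[batch_indices, :] ** 2)

print(full_penalty)  # penalizes every entry of W
print(sub_penalty)   # smaller; ignores rows 1 and 3 entirely
```

The sub-tensor penalty changes from batch to batch with batch_indices, which is why the whole-matrix form is the standard one.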
On Sunday, August 28, 2016 at 11:15:12 PM UTC-7, shashank gupta wrote:
> Hi All,
>
> I have a small doubt. If a model uses only part of a parameter in the cost, then when we add the regularization cost (say, the L2 norm) of that parameter to the cost function, should we add the L2 norm of the whole parameter matrix, or the L2 norm of the sub_tensor?
>
> More precisely, which of the following costs is the correct one?
>
> W = T.matrix()
>
> cost = f(W[batch_indices, :]) + T.sum(W ** 2)
>
> OR
>
> cost = f(W[batch_indices, :]) + T.sum(W[batch_indices, :])
>
> Please let me know which of the above cost declarations is correct.
