Hi,

The L2 update rule is derived from the derivative of the loss function with respect to the model weights - an L2-regularized loss function contains an additional additive term involving the weights. This paper provides some useful mathematical background: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.58.7377
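For concreteness, here is a rough sketch of that objective and the resulting gradient step (the notation and the plain step-size eta are my own, not taken from the Spark code):

```latex
% L2-regularized objective: data loss plus a penalty on the squared norm of the weights
\[
J(w) \;=\; L(w) \;+\; \frac{\lambda}{2}\,\lVert w \rVert_2^2
\]
% Differentiating adds a lambda * w term to the gradient, so one gradient step becomes
\[
w_{t+1} \;=\; w_t \;-\; \eta\,\bigl(\nabla L(w_t) \;+\; \lambda\, w_t\bigr)
\]
```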
The code that computes the new L2 weight is here:
https://github.com/apache/incubator-spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/optimization/Updater.scala#L90

The compute function calculates the new weights based on the current weights and the gradient as computed at each step. Contrast it with the code in the SimpleUpdater class to get a sense for how the regularization parameter is incorporated - it's fairly simple. (A rough sketch of that contrast appears after the quoted thread below.)

In general, though, I agree it makes sense to include a discussion of the algorithm and a reference to the specific version we implement in the scaladoc.

- Evan

On Thu, Jan 9, 2014 at 10:49 AM, Walrus theCat <[email protected]> wrote:

> No -- I'm not, and I appreciate the comment. What I'm looking for is a
> specific mathematical formula that I can map to the source code.
>
> Personally, specifically, I'd like to see how the loss function gets
> embedded into the w (gradient), in the case of the regularized and
> unregularized operation.
>
> Looking through the source, the "loss history" makes sense to me, but I
> can't see how that translates into the effect on the gradient.
>
>
> On Thu, Jan 9, 2014 at 10:39 AM, Sean Owen <[email protected]> wrote:
>
>> L2 regularization just means "regularizing by penalizing parameters
>> whose L2 norm is large", and L2 norm just means squared length. It's
>> not something you would write an ML paper on any more than what the
>> vector dot product is. Are you asking something else?
>>
>> On Thu, Jan 9, 2014 at 6:19 PM, Walrus theCat <[email protected]> wrote:
>> > Thanks Christopher,
>> >
>> > I wanted to know if there was a specific paper this particular codebase was
>> > based on. For instance, Weka cites papers in their documentation.
>> >
>> > On Wed, Jan 8, 2014 at 7:10 PM, Christopher Nguyen <[email protected]> wrote:
>> >>
>> >> Walrus, given the question, this may be a good place for you to start.
>> >> There's some good discussion there as well as links to papers.
>> >>
>> >> http://www.quora.com/Machine-Learning/What-is-the-difference-between-L1-and-L2-regularization
>> >>
>> >> Sent while mobile. Pls excuse typos etc.
>> >>
>> >> On Jan 8, 2014 2:24 PM, "Walrus theCat" <[email protected]> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> Can someone point me to the paper that algorithm is based on?
>> >>>
>> >>> Thanks
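Not the actual Spark source, but a minimal, self-contained sketch in plain Scala of the kind of update an unregularized updater performs next to an L2-regularized one (the function names, the Array[Double] representation, and the 1/sqrt(iter) step-size decay are assumptions for illustration only):

```scala
object UpdaterSketch {

  // Plain gradient step (SimpleUpdater-style rule, no regularization):
  //   w_new = w - eta * gradient
  def simpleUpdate(weights: Array[Double], gradient: Array[Double],
                   stepSize: Double, iter: Int): Array[Double] = {
    val eta = stepSize / math.sqrt(iter) // assumed decaying step size
    weights.zip(gradient).map { case (w, g) => w - eta * g }
  }

  // L2-regularized step: the lambda * w term from the penalty's derivative
  // shows up as a shrinkage of the current weights.
  //   w_new = w - eta * (gradient + lambda * w)
  //         = (1 - eta * lambda) * w - eta * gradient
  def l2Update(weights: Array[Double], gradient: Array[Double],
               stepSize: Double, iter: Int, regParam: Double): Array[Double] = {
    val eta = stepSize / math.sqrt(iter)
    weights.zip(gradient).map { case (w, g) => (1.0 - eta * regParam) * w - eta * g }
  }

  def main(args: Array[String]): Unit = {
    val w    = Array(0.5, -1.2)
    val grad = Array(0.1, -0.3)
    println(simpleUpdate(w, grad, stepSize = 1.0, iter = 1).mkString(", "))
    println(l2Update(w, grad, stepSize = 1.0, iter = 1, regParam = 0.1).mkString(", "))
  }
}
```

The only difference between the two functions is the (1 - eta * regParam) shrinkage factor applied to the existing weights, which is where the regularization parameter enters the update.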
