On Thu, Nov 17, 2011 at 07:29, Mathieu Blondel <[email protected]> wrote:
> In most SGD papers I know, people do:
>
> 1) Sample instance x_i
> 2) Predict label for x_i
> 3) Regularize weight
> 4) Update weight if non-zero loss suffered
>
> However, J. Langford and B. Carpenter do:
>
> 1) Sample instance x_i
> 2) Regularize weight
> 3) Predict label for x_i
> 4) Update weight if non-zero loss suffered
>
> Regularization doesn't depend on the prediction, and regularization may
> change the prediction, so I guess it makes sense to do it like J.
> Langford and B. Carpenter. But do people have feedback about which one
> is usually empirically better?
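[Editor's note: for concreteness, here is a minimal sketch of the two orderings for a plain linear SGD step with hinge loss and L2 regularization applied as multiplicative weight shrinking. The function name and the parameters eta and alpha are illustrative assumptions, not anything taken from the thread or from scikit-learn.]

    # Minimal sketch (not from the thread): one SGD epoch for a linear
    # model with hinge loss, comparing the two orderings of the
    # regularization step discussed above.
    import numpy as np

    def sgd_epoch(X, y, w, eta=0.01, alpha=0.0001, regularize_first=False):
        """y is assumed to be in {-1, +1}; w is modified in place."""
        for i in np.random.permutation(len(y)):
            if regularize_first:
                # Langford / Carpenter ordering: shrink before predicting,
                # so the prediction is made with the regularized weights.
                w *= 1.0 - eta * alpha
            margin = y[i] * np.dot(w, X[i])
            if not regularize_first:
                # Common ordering: predict first, then shrink.
                w *= 1.0 - eta * alpha
            if margin < 1.0:  # non-zero hinge loss suffered
                w += eta * y[i] * X[i]
        return w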
What do you mean by regularize weight here? Do an L1 truncation?

--
- Alexandre
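[Editor's note: "L1 truncation" presumably refers to the truncated-gradient step of Langford, Li & Zhang, which moves each weight toward zero by eta * alpha and clips it at zero if it crosses; the sketch below is an illustrative assumption, not code from the thread.]

    # Minimal sketch (assumed) of an L1 truncation step: soft-thresholding
    # of the weight vector, as in truncated gradient.
    import numpy as np

    def l1_truncate(w, eta, alpha):
        # Shrink each weight toward zero by eta * alpha; weights that
        # would cross zero are set exactly to zero (inducing sparsity).
        return np.sign(w) * np.maximum(np.abs(w) - eta * alpha, 0.0)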
