On Thu, Nov 17, 2011 at 07:29, Mathieu Blondel <[email protected]> wrote:
> In most SGD papers I know, people do:
>
> 1) Sample instance x_i
> 2) Predict label for x_i
> 3) Regularize weight
> 4) Update weight if a non-zero loss is suffered
>
> However, J. Langford and B. Carpenter do:
>
> 1) Sample instance x_i
> 2) Regularize weight
> 3) Predict label for x_i
> 4) Update weight if a non-zero loss is suffered
>
> Regularization doesn't depend on the prediction, but it may change the
> prediction, so I guess it makes sense to do it like J. Langford and
> B. Carpenter. Still, do people have feedback on which ordering is
> usually empirically better?
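
(For concreteness, a minimal sketch of the two orderings, assuming the
"regularize" step is a multiplicative L2 shrinkage and the loss is the
hinge loss; both are illustrative assumptions, not necessarily what the
papers use:)

    import numpy as np

    def epoch_predict_then_regularize(w, X, y, lr=0.01, lam=1e-4, seed=0):
        """Ordering 1: predict, then shrink, then update on non-zero loss."""
        rng = np.random.default_rng(seed)
        for i in rng.permutation(len(X)):    # 1) sample instance x_i
            margin = y[i] * np.dot(w, X[i])  # 2) predict label for x_i
            w *= 1.0 - lr * lam              # 3) regularize (L2 shrinkage)
            if margin < 1.0:                 # 4) update if hinge loss > 0
                w += lr * y[i] * X[i]
        return w

    def epoch_regularize_then_predict(w, X, y, lr=0.01, lam=1e-4, seed=0):
        """Ordering 2 (Langford/Carpenter): shrink before predicting."""
        rng = np.random.default_rng(seed)
        for i in rng.permutation(len(X)):    # 1) sample instance x_i
            w *= 1.0 - lr * lam              # 2) regularize weight first
            margin = y[i] * np.dot(w, X[i])  # 3) predict with the shrunk w
            if margin < 1.0:                 # 4) update if hinge loss > 0
                w += lr * y[i] * X[i]
        return w

The only difference is whether the margin is computed before or after
the shrinkage step.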

What do you mean by regularize weight here? Do an L1 truncation?
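(i.e., a per-coordinate soft-thresholding of the weights toward zero, in
the spirit of Langford et al.'s truncated gradient. A minimal sketch,
assuming that reading:)

    import numpy as np

    def l1_truncate(w, lr=0.01, lam=1e-4):
        # Soft-threshold each coordinate toward zero by lr * lam;
        # coordinates below the threshold are clipped to exactly 0,
        # which is what produces sparsity.
        return np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)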
-- 
 - Alexandre
