Re: [Scikit-learn-general] Possible bug in SGD with L1 regularization and question

2011-11-21 Thread Mathieu Blondel
On Sun, Nov 20, 2011 at 5:52 AM, Olivier Grisel wrote:
> Also do you have any hint whether this has an impact on the test error
> in practice on your data?

I've implemented the naive and lazy implementations of Langford (and made sure that they give the same results), so I will try them on several …
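The naive/lazy pair mentioned above can be sketched as follows. This is my own minimal dense-data reconstruction, not Mathieu's code: it assumes a hinge loss and clipped truncation (gravity g = lr * lam per step). The naive version truncates every coordinate at every step; the lazy version defers truncation for a coordinate until it is next touched, then applies the accumulated penalty in one catch-up step, which is what makes it cheap on sparse data. Both should produce identical weights.

```python
import numpy as np

def sgd_l1_naive(X, y, lr=0.1, lam=0.01, epochs=5):
    """SGD with clipped L1 truncation applied to ALL weights each step."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in range(n):
            if y[i] * X[i].dot(w) < 1:          # hinge-loss subgradient step
                w += lr * y[i] * X[i]
            # truncate every coordinate toward zero, clipped at zero
            w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
    return w

def sgd_l1_lazy(X, y, lr=0.1, lam=0.01, epochs=5):
    """Same result, but truncation of coordinate j is deferred until
    x_i[j] != 0, then applied as one catch-up step."""
    n, d = X.shape
    w = np.zeros(d)
    last = np.zeros(d, dtype=int)   # step at which w_j was last brought current
    t = 0
    for _ in range(epochs):
        for i in range(n):
            nz = np.nonzero(X[i])[0]
            # apply the (t - last[j]) pending truncations in one shot
            w[nz] = np.sign(w[nz]) * np.maximum(
                np.abs(w[nz]) - (t - last[nz]) * lr * lam, 0.0)
            last[nz] = t
            if y[i] * X[i].dot(w) < 1:
                w[nz] += lr * y[i] * X[i, nz]
            t += 1                  # this step's own truncation is deferred too
    # flush the remaining pending penalties
    return np.sign(w) * np.maximum(np.abs(w) - (t - last) * lr * lam, 0.0)

X = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0], [2.0, 0.0, 0.0]])
y = np.array([1.0, -1.0, 1.0, -1.0])
print(sgd_l1_naive(X, y))
print(sgd_l1_lazy(X, y))    # identical to the naive result
```

The catch-up is exact because, between two touches of a coordinate, only truncations happen, and k clipped truncations of gravity g collapse into a single truncation of gravity k*g.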

Re: [Scikit-learn-general] Possible bug in SGD with L1 regularization and question

2011-11-19 Thread Olivier Grisel
2011/11/19 Mathieu Blondel:
> On Wed, Nov 9, 2011 at 4:37 AM, Peter Prettenhofer wrote:
>
>> Unfortunately, I'm not that familiar with "SGD-L1 (Clipped +
>> Lazy-Update)" either - I just quickly skimmed over a technical report
>> of Bob [1]. I agree with your description: it seems to me that th…

Re: [Scikit-learn-general] Possible bug in SGD with L1 regularization and question

2011-11-18 Thread Mathieu Blondel
On Wed, Nov 9, 2011 at 4:37 AM, Peter Prettenhofer wrote:
> Unfortunately, I'm not that familiar with "SGD-L1 (Clipped +
> Lazy-Update)" either - I just quickly skimmed over a technical report
> of Bob [1]. I agree with your description: it seems to me that the
> major difference is the fact that …

Re: [Scikit-learn-general] Possible bug in SGD with L1 regularization and question

2011-11-17 Thread Mathieu Blondel
On Thu, Nov 17, 2011 at 10:03 PM, Alexandre Passos wrote:
> What do you mean by regularize weight here? Do an L1 truncation?

Yes. For L2 regularization, doing the regularization before or after the prediction doesn't change the sign of the prediction (as L2 regularization just needs to multiply …
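Mathieu's point can be checked numerically: an L2 step multiplies w by a positive scalar, so sign(w·x) is unchanged, whereas an L1 truncation can zero out a small coordinate and flip the predicted label. A toy sketch (the numbers are mine, chosen for illustration):

```python
import numpy as np

w = np.array([0.05, -0.3])   # current weights (made-up values)
x = np.array([1.0, 0.1])     # incoming instance

# L2 step: w <- (1 - lr * lam) * w is a positive rescaling,
# so the sign of the prediction w.dot(x) cannot change.
shrink = 1 - 0.1 * 0.5
assert np.sign((shrink * w).dot(x)) == np.sign(w.dot(x))

# L1 truncation with gravity g can zero small coordinates entirely.
g = 0.1
w_l1 = np.sign(w) * np.maximum(np.abs(w) - g, 0.0)
print(np.sign(w.dot(x)), np.sign(w_l1.dot(x)))   # 1.0 -1.0: the label flips
```

This is why the predict/regularize ordering is immaterial for the sign under L2 but not under L1.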

Re: [Scikit-learn-general] Possible bug in SGD with L1 regularization and question

2011-11-17 Thread Alexandre Passos
On Thu, Nov 17, 2011 at 07:29, Mathieu Blondel wrote:
> In most SGD papers I know, people do:
>
> 1) Sample instance x_i
> 2) Predict label for x_i
> 3) Regularize weight
> 4) Update weight if non-zero loss suffered
>
> However, J. Langford and B. Carpenter do:
>
> 1) Sample instance x_i
> 2) Regu…

Re: [Scikit-learn-general] Possible bug in SGD with L1 regularization and question

2011-11-17 Thread Mathieu Blondel
In most SGD papers I know, people do:

1) Sample instance x_i
2) Predict label for x_i
3) Regularize weight
4) Update weight if non-zero loss suffered

However, J. Langford and B. Carpenter do:

1) Sample instance x_i
2) Regularize weight
3) Predict label for x_i
4) Update weight if non-zero loss s…
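The two orderings above can be contrasted in a few lines. This is my own sketch, not code from either paper; the loss-driven update (step 4) is elided to isolate the predict/regularize swap, and the weights are made-up values that sit near the truncation threshold:

```python
import numpy as np

def truncate(w, g):
    # clipped L1 truncation with gravity g
    return np.sign(w) * np.maximum(np.abs(w) - g, 0.0)

def step_predict_then_regularize(w, x, g):
    pred = np.sign(w.dot(x))   # 2) predict
    w = truncate(w, g)         # 3) regularize
    return pred, w

def step_regularize_then_predict(w, x, g):
    w = truncate(w, g)         # 2) regularize
    pred = np.sign(w.dot(x))   # 3) predict
    return pred, w

w = np.array([0.2, -0.04])
x = np.array([0.1, 1.0])
p1, _ = step_predict_then_regularize(w.copy(), x, g=0.1)
p2, _ = step_regularize_then_predict(w.copy(), x, g=0.1)
print(p1, p2)   # -1.0 1.0: the two orderings predict different labels
```

The final weight is the same either way; what differs is which version of w each instance is scored against, and hence which instances count as loss-suffering.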

Re: [Scikit-learn-general] Possible bug in SGD with L1 regularization and question

2011-11-10 Thread Mathieu Blondel
On Thu, Nov 10, 2011 at 8:12 PM, Adrien wrote:
> For my own needs (projected gradient descent), I quickly implemented it
> here: https://gist.github.com/1272551 (I tested it against Duchi's own
> Matlab code).

I think you implemented the algorithm based on sorting, which has complexity O(n_featu…
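For reference, the sorting-based Euclidean projection onto the L1 ball from Duchi et al. (2008) can be sketched as below. This is my own minimal version, not the code in the gist; the sorting step gives the O(n log n) cost Mathieu refers to, and a randomized-pivot variant of the same threshold search runs in expected linear time.

```python
import numpy as np

def project_l1_ball(v, z=1.0):
    """Euclidean projection of v onto {w : ||w||_1 <= z},
    via the sorting-based algorithm of Duchi et al. (2008)."""
    if np.abs(v).sum() <= z:
        return v.copy()                       # already inside the ball
    u = np.sort(np.abs(v))[::-1]              # magnitudes, descending
    css = np.cumsum(u)
    k = np.arange(1, len(u) + 1)
    rho = np.nonzero(u - (css - z) / k > 0)[0][-1]
    theta = (css[rho] - z) / (rho + 1.0)      # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

w = project_l1_ball(np.array([1.0, -2.0, 0.5]), z=1.0)
print(w)   # [ 0. -1.  0.]
```

The projection reduces to soft-thresholding at a level theta chosen so the result lands exactly on the ball's surface.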

Re: [Scikit-learn-general] Possible bug in SGD with L1 regularization and question

2011-11-10 Thread Adrien
On 09/11/2011 at 14:32, Mathieu Blondel wrote:
> On Wed, Nov 9, 2011 at 4:37 AM, Peter Prettenhofer wrote:
>
>> I'm aware of the issue - it seems to me that Bob is right but I can
>> hardly tell based on empirical evidence. Truncated gradient is quite a
>> crude procedure anyways - Olivier once …

Re: [Scikit-learn-general] Possible bug in SGD with L1 regularization and question

2011-11-09 Thread Mathieu Blondel
On Wed, Nov 9, 2011 at 4:37 AM, Peter Prettenhofer wrote:
> I'm aware of the issue - it seems to me that Bob is right but I can
> hardly tell based on empirical evidence. Truncated gradient is quite a
> crude procedure anyways - Olivier once suggested to use a projected
> gradient approach instea…

Re: [Scikit-learn-general] Possible bug in SGD with L1 regularization and question

2011-11-08 Thread Peter Prettenhofer
2011/11/8 Mathieu Blondel:
> Hello,
>
> I was re-reading Tsuruoka's paper, based on which the SGDClassifier
> implements L1 regularization, and found this interesting post (as
> usual?) by Bob Carpenter:
>
> http://lingpipe-blog.com/2009/09/18/tsuruoka-tsujii-ananiadou-2009-stochastic-gradient-desc…

[Scikit-learn-general] Possible bug in SGD with L1 regularization and question

2011-11-08 Thread Mathieu Blondel
Hello,

I was re-reading Tsuruoka's paper, based on which the SGDClassifier implements L1 regularization, and found this interesting post (as usual?) by Bob Carpenter:

http://lingpipe-blog.com/2009/09/18/tsuruoka-tsujii-ananiadou-2009-stochastic-gradient-descent-training-for-l1-regularized-log-line…
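For context, the method in Tsuruoka's paper applies L1 via a cumulative penalty: each weight remembers the total penalty it has actually received (q) and is clipped toward the total it could have received (u), which smooths the noise of per-step truncation. A minimal sketch of that update rule; the hinge loss and hyperparameters are my own choices for illustration, not what SGDClassifier hard-codes:

```python
import numpy as np

def sgd_l1_cumulative(X, y, lr=0.1, lam=0.01, epochs=5):
    """SGD with L1 via the cumulative-penalty rule of Tsuruoka et al. (2009)."""
    n, d = X.shape
    w = np.zeros(d)
    q = np.zeros(d)   # total L1 penalty actually applied to each weight
    u = 0.0           # total L1 penalty each weight could have received
    for _ in range(epochs):
        for i in range(n):
            if y[i] * X[i].dot(w) < 1:      # hinge-loss subgradient step
                w += lr * y[i] * X[i]
            u += lr * lam
            z = w.copy()
            pos, neg = w > 0, w < 0
            # clip each weight toward its outstanding cumulative penalty
            w[pos] = np.maximum(0.0, w[pos] - (u + q[pos]))
            w[neg] = np.minimum(0.0, w[neg] + (u - q[neg]))
            q += w - z                      # record what was actually applied
    return w

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, -1.0, 1.0])
print(sgd_l1_cumulative(X, y))
```

A weight that was recently pushed up by gradient steps has q close to -u already absorbed, so it is not re-penalized from scratch; this is the behavior Bob Carpenter's post scrutinizes.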