Yes, I was about to answer the same thing: SGD is great when n_samples >
n_features, but the situation n_samples << n_features also exists.

In such a situation, I believe that a cyclic coordinate descent with a
clever way of choosing the coordinates is the fastest approach. In some
sense it is the transpose of SGD (hand-wavingly).

I would indeed like to see a fast coordinate descent solver for logistic
regression. I am more interested in the l1 penalty, but the l2 penalty is
also useful. A multinomial loss could also fall within the scope of such work.
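
To make the idea concrete, here is a minimal sketch of plain cyclic
coordinate descent for l1-penalized logistic regression with labels in
{-1, +1}. It is only an illustration under simplifying assumptions: the
names (cd_l1_logreg, soft_threshold, objective, alpha, n_sweeps) are made
up for this example, each coordinate step uses a fixed upper bound on the
curvature rather than a Newton step with line search, and coordinates are
visited in order with no clever selection. A solver that is actually fast
for large n_features would need sparse data structures and compiled loops.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * |.| applied to the scalar z."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def objective(X, y, w, alpha):
    """sum_i log(1 + exp(-y_i * x_i.w)) + alpha * ||w||_1"""
    margins = y * X.dot(w)
    return np.logaddexp(0.0, -margins).sum() + alpha * np.abs(w).sum()


def cd_l1_logreg(X, y, alpha=1.0, n_sweeps=100):
    """Cyclic coordinate descent sketch; y must contain -1 / +1 labels."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    # margins[i] = y_i * x_i.w, kept up to date after every coordinate step
    margins = np.zeros(n_samples)
    # The logistic loss has second derivative <= 1/4, so this bounds the
    # curvature of the loss along each coordinate.
    curvature = 0.25 * (X ** 2).sum(axis=0)
    for _ in range(n_sweeps):
        for j in range(n_features):
            if curvature[j] == 0.0:
                continue  # all-zero column: w[j] stays at 0
            # 1 / (1 + exp(margin)), computed in a numerically stable way
            sig = np.exp(-np.logaddexp(0.0, margins))
            grad_j = -np.dot(X[:, j], y * sig)
            # Minimize the quadratic upper bound plus the l1 term
            w_new = soft_threshold(w[j] - grad_j / curvature[j],
                                   alpha / curvature[j])
            delta = w_new - w[j]
            if delta != 0.0:
                margins += delta * y * X[:, j]
                w[j] = w_new
    return w
```

Each coordinate update only touches one column of X, which is why this
kind of solver is attractive when n_samples << n_features, and the
soft-thresholding step is what produces exact zeros in w.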

For such a contribution to be actually useful, I'd like the code to be
really fast with large n_features: we don't need a solver that doesn't
scale to real problems. I am not an expert, but I think that a reference
that I recently mentioned could be useful:

http://www.jmlr.org/papers/volume11/yuan10c/yuan10c.pdf

Obviously, doing this right is quite a lot of work. I think that my group
could invest some effort in this direction. We have started to discuss
this a bit.

Gaël
