On Wed, Mar 7, 2012 at 5:51 AM, Gael Varoquaux
<[email protected]> wrote:

> In such a situation, I believe that a cyclic coordinate descent with a
> clever way of choosing the coordinates is the fastest approach. In some
> sense it is the transpose of the SGD (hand-wavingly).

For n_features > n_samples, I believe that coordinate descent is
faster in the dual, which has one variable per sample rather than one
per feature. A primal coordinate descent optimizes one w_i at a time,
so a full pass over high-dimensional data is expensive. Liblinear
implements shrinking to avoid revisiting some coordinates. Maybe the
greedy selection of coordinates you mention can also help. But then,
can it still be called cyclic?
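
To make "one w_i at a time" concrete, here is a minimal pure-Python
sketch of cyclic primal coordinate descent for L2-regularized logistic
regression. It is not liblinear's CDN: instead of a Newton step with
line search, each coordinate takes a step scaled by an upper bound on
the curvature, which is simpler and cannot increase the objective. The
function names, the toy data and the parameter lam are assumptions made
for illustration only.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log1pexp(z):
    # Stable evaluation of log(1 + exp(z)).
    if z > 0:
        return z + math.log1p(math.exp(-z))
    return math.log1p(math.exp(z))


def objective(X, y, w, lam):
    # 0.5 * lam * ||w||^2 + sum_i log(1 + exp(-y_i * x_i.w))
    total = 0.5 * lam * sum(wj * wj for wj in w)
    for xi, yi in zip(X, y):
        margin = yi * sum(a * b for a, b in zip(xi, w))
        total += log1pexp(-margin)
    return total


def primal_cd_logreg(X, y, lam=1.0, n_sweeps=50):
    n, d = len(X), len(X[0])
    w = [0.0] * d
    scores = [0.0] * n  # cached x_i.w, updated after every coordinate step
    # Upper bound on the second derivative along coordinate j
    # (the logistic loss has curvature at most 1/4).
    bound = [0.25 * sum(X[i][j] ** 2 for i in range(n)) + lam
             for j in range(d)]
    for _ in range(n_sweeps):
        for j in range(d):  # cyclic order: every coordinate, every sweep
            # Partial derivative of the objective with respect to w_j.
            # Only column j of X is read here.
            g = lam * w[j]
            for i in range(n):
                if X[i][j] != 0.0:
                    g -= y[i] * X[i][j] * sigmoid(-y[i] * scores[i])
            step = -g / bound[j]
            w[j] += step
            for i in range(n):
                if X[i][j] != 0.0:
                    scores[i] += step * X[i][j]
    return w


# Toy data, labels in {-1, +1}.
X = [[1.0, 0.0, 2.0],
     [0.0, 1.0, -1.0],
     [-1.0, 0.5, 0.0],
     [0.5, -2.0, 1.0]]
y = [1, -1, -1, 1]
w = primal_cd_logreg(X, y, lam=1.0, n_sweeps=50)
print(objective(X, y, [0.0] * 3, 1.0), objective(X, y, w, 1.0))
```

Each sweep touches every feature once, so its cost grows with
n_features; shrinking or a greedy selection rule would skip part of the
inner loop over j.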

> I would indeed like to see a fast coordinate descent solver for logistic
> regression. I am more interested in the l1 penalty, but the l2 penalty is
> also useful. Multinomial loss could fall in such work.
>
> For such a contribution to be actually useful, I'd like the code to be
> really fast with large n_features: we don't need a solver that doesn't
> scale to real problems. I am not an expert, but I think that a reference
> that I recently mentioned could be useful:
>
> http://www.jmlr.org/papers/volume11/yuan10c/yuan10c.pdf

Note that the coordinate descent Newton (CDN) algorithms for Logistic
Regression and L2-SVM mentioned in that paper are already in liblinear
and hence in scikit-learn :)

Coordinate descent in the primal works better in CSC format, because
each update reads one column (feature) of the data. This representation
is not natural for some algorithms. So if you work in a Pipeline of
transformations, you may need to hold the dataset twice in memory at
some point (in CSR and in CSC). It could therefore still be nice to
implement algorithms which are a good fit for CSR format.
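
To illustrate the memory point, here is a small pure-Python sketch of a
CSR to CSC conversion. The function name csr_to_csc and the toy matrix
are assumptions for illustration; a real pipeline would rely on a sparse
matrix library rather than this code. The point is that the conversion
allocates a second full set of value and index arrays, so both copies
are alive at once.

```python
def csr_to_csc(data, indices, indptr, n_cols):
    """Convert CSR arrays (values, column indices, row pointers) to CSC."""
    n_rows = len(indptr) - 1
    # Count the nonzeros in each column.
    counts = [0] * n_cols
    for j in indices:
        counts[j] += 1
    # Column pointers are the cumulative sums of the counts.
    col_ptr = [0] * (n_cols + 1)
    for j in range(n_cols):
        col_ptr[j + 1] = col_ptr[j] + counts[j]
    # Second copy of the nonzeros: this is the extra memory.
    csc_data = [0.0] * len(data)
    row_idx = [0] * len(data)
    next_slot = col_ptr[:-1]  # next free position in each column
    for i in range(n_rows):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            dest = next_slot[j]
            csc_data[dest] = data[k]
            row_idx[dest] = i
            next_slot[j] += 1
    return csc_data, row_idx, col_ptr


# Toy matrix:
# [[1, 0, 2],
#  [0, 0, 3],
#  [4, 5, 0]]
data = [1.0, 2.0, 3.0, 4.0, 5.0]
indices = [0, 2, 2, 0, 1]
indptr = [0, 2, 3, 5]
csc_data, row_idx, col_ptr = csr_to_csc(data, indices, indptr, n_cols=3)

# Column j is now a contiguous slice, which is the access pattern a
# primal coordinate descent update needs.
j = 2
column = list(zip(row_idx[col_ptr[j]:col_ptr[j + 1]],
                  csc_data[col_ptr[j]:col_ptr[j + 1]]))
print(column)
```

With only the CSR arrays, reading one column means scanning every row,
which is why a row-oriented algorithm is a better fit when the rest of
the pipeline produces CSR.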

Mathieu
