On Wed, Mar 7, 2012 at 5:51 AM, Gael Varoquaux <[email protected]> wrote:
> In such a situation, I believe that a cyclic coordinate descent with a
> clever way of choosing the coordinates is the fastest approach. In some
> sense it is the transpose of SGD (hand-wavingly).

For n_features > n_samples, I believe that coordinate descent is faster
in the dual. A primal coordinate descent needs to optimize one w_i at a
time, so it can take a long time if your data is high dimensional.
Liblinear implements shrinking to avoid revisiting some coordinates.
Maybe the greedy selection of coordinates you mention can also help. But
then, can it still be called cyclic?

> I would indeed like to see a fast coordinate descent solver for logistic
> regression. I am more interested in the l1 penalty, but the l2 penalty is
> also useful. Multinomial loss could fall in such work.
>
> For such a contribution to be actually useful, I'd like the code to be
> really fast with large n_features: we don't need a solver that doesn't
> scale to real problems. I am not an expert, but I think that a reference
> that I recently mentioned could be useful:
>
> http://www.jmlr.org/papers/volume11/yuan10c/yuan10c.pdf

Note that the coordinate descent Newton (CDN) algorithms for Logistic
Regression and L2-SVM mentioned in that paper are already in liblinear
and hence in scikit-learn :)

Coordinate descent in the primal works better in CSC format, because each
update reads one column of the data. This representation is not natural
for some algorithms. Therefore, if you work in a Pipeline of
transformations, you may need to hold the dataset twice in memory at some
point (in CSR and in CSC). So it could still be nice to implement
algorithms which are a good fit for the CSR format.

Mathieu
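
To make the CSC point concrete, here is a minimal pure-Python sketch, not
liblinear's or scikit-learn's code. It assumes labels in {0, 1}, an l2
penalty with strength `lam`, and a plain one-dimensional Newton step per
coordinate with no line search and no shrinking, so it is simpler than the
CDN algorithm discussed above. The names `csr_to_csc`, `cd_sweep` and
`objective` are hypothetical helpers made up for this illustration.

```python
import math


def csr_to_csc(data, indices, indptr, n_cols):
    """Build CSC arrays from CSR arrays (a second copy of the non-zeros)."""
    n_rows = len(indptr) - 1
    col_ptr = [0] * (n_cols + 1)
    for j in indices:                      # count non-zeros per column
        col_ptr[j + 1] += 1
    for j in range(n_cols):                # prefix sum -> column offsets
        col_ptr[j + 1] += col_ptr[j]
    next_slot = col_ptr[:-1]               # next free slot in each column
    csc_data = [0.0] * len(data)
    row_idx = [0] * len(data)
    for i in range(n_rows):
        for k in range(indptr[i], indptr[i + 1]):
            slot = next_slot[indices[k]]
            csc_data[slot], row_idx[slot] = data[k], i
            next_slot[indices[k]] += 1
    return csc_data, row_idx, col_ptr


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def objective(w, z, y, lam):
    """Log-loss for labels in {0, 1} plus (lam / 2) * ||w||^2, with z = X w."""
    loss = sum(max(zi, 0.0) + math.log1p(math.exp(-abs(zi))) - yi * zi
               for zi, yi in zip(z, y))
    return loss + 0.5 * lam * sum(wj * wj for wj in w)


def cd_sweep(csc, y, w, z, lam):
    """One cyclic pass over the coordinates, updating w and z = X w in place."""
    csc_data, row_idx, col_ptr = csc
    for j in range(len(w)):
        lo, hi = col_ptr[j], col_ptr[j + 1]   # column j is one contiguous slice
        g, h = lam * w[j], lam                # partial gradient and curvature
        for k in range(lo, hi):
            p = sigmoid(z[row_idx[k]])
            g += csc_data[k] * (p - y[row_idx[k]])
            h += csc_data[k] ** 2 * p * (1.0 - p)
        d = -g / h                            # Newton step (assumption: no line search)
        w[j] += d
        for k in range(lo, hi):               # only rows touching column j change
            z[row_idx[k]] += d * csc_data[k]


# Toy 4 x 3 matrix in CSR form: rows [1,0,2], [0,1,0], [1,1,0], [0,0,1].
data = [1.0, 2.0, 1.0, 1.0, 1.0, 1.0]
indices = [0, 2, 1, 0, 1, 2]
indptr = [0, 2, 3, 5, 6]
y = [1, 0, 1, 0]
lam = 1.0

csc = csr_to_csc(data, indices, indptr, n_cols=3)
w = [0.0, 0.0, 0.0]
z = [0.0] * len(y)
obj_start = objective(w, z, y, lam)
for _ in range(20):
    cd_sweep(csc, y, w, z, lam)
obj_end = objective(w, z, y, lam)
print(obj_start, obj_end)
```

Each update of w_j only touches the slice `col_ptr[j]:col_ptr[j + 1]`, which
is why the primal solver wants columns; with CSR input alone it would have
to scan every row to find the entries of feature j. The conversion is also
where the second in-memory copy comes from.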
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
