On 12/06/2011 04:55 AM, Gael Varoquaux wrote:
> On Mon, Dec 05, 2011 at 10:54:42PM +0100, Olivier Grisel wrote:
>> - libsvm uses SMO (a dual solver) and supports non-linear kernels and
>>   has complexity ~ n_samples^3 hence cannot scale to large n_samples
>>   (e.g. more than 50k).
>> - liblinear uses some kind of fancy coordinate descent (primal or dual
>>   solvers) optimized for regularized linear models, provides more
>>   regularization / loss function options such as l1 penalty and can
>>   scale to large n_samples (as long as the sparse internal
>>   representation of the data fits in memory).
>>> By the way, I suggest someone update the documentation to specify what
>>> the consequences of using the different SVM classes are. Currently
>>> LinearSVC is recommended "for huge datasets", not "for huge sparse
>>> datasets." That is on this page:
>>> http://scikit-learn.sourceforge.net/dev/modules/generated/sklearn.svm.LinearSVC.html
>> For huge dense data, the only viable option is SGDClassifier on memory
>> mapped arrays (double precision).
> The full content of the above paragraphs should be pasted in the docs
> (with a little bit of rewording).

+1
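For the docs, a short example might help illustrate the "SGDClassifier on
memory mapped arrays" option quoted above. The following is only a rough
sketch, not a recommended recipe: it assumes a recent scikit-learn where
SGDClassifier provides partial_fit, and the file name, array sizes, chunk
size and synthetic labels are all made up for illustration.

```python
import os
import tempfile

import numpy as np
from sklearn.linear_model import SGDClassifier

# Hypothetical sizes, kept small so the sketch runs quickly.
n_samples, n_features, chunk = 2000, 20, 500

rng = np.random.RandomState(0)
w_true = rng.randn(n_features)  # made-up "true" separating direction

# Write dense double-precision data to disk chunk by chunk, so the full
# matrix never has to be built in memory at once.
path = os.path.join(tempfile.mkdtemp(), "X.dat")
X = np.memmap(path, dtype=np.float64, mode="w+",
              shape=(n_samples, n_features))
y = np.empty(n_samples, dtype=np.int64)
for start in range(0, n_samples, chunk):
    block = rng.randn(chunk, n_features)
    X[start:start + chunk] = block
    y[start:start + chunk] = (block @ w_true > 0).astype(np.int64)
X.flush()

# Re-open read-only and train a linear SVM (hinge loss) incrementally.
X_ro = np.memmap(path, dtype=np.float64, mode="r",
                 shape=(n_samples, n_features))
clf = SGDClassifier(loss="hinge", random_state=0)
classes = np.array([0, 1])  # must be given on the first partial_fit call
for epoch in range(5):
    for start in range(0, n_samples, chunk):
        stop = start + chunk
        clf.partial_fit(X_ro[start:stop], y[start:stop], classes=classes)

# Accuracy on the first chunk (training data, so only a sanity check).
accuracy = clf.score(X_ro[:chunk], y[:chunk])
print("accuracy on first chunk: %.3f" % accuracy)
```

Only one chunk of the memory-mapped array is read per partial_fit call,
which is what would let this approach go beyond what fits in RAM; for real
data the chunks should probably be shuffled between epochs.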
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
