> Indeed Abhi, which specific section of the documentation (or
> docstring) led you to ask this question?
>
> The note on this page is pretty explicit:
>
> http://scikit-learn.org/dev/modules/multiclass.html
>
> Along with the docstring:
>
> http://scikit-learn.org/dev/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression
>
> Maybe the docstring could be made more consistent and use the
> one-vs-rest notation instead of one-vs-all (which is a synonym).
Ah, sorry, I should have been clearer: the docstring and the multiclass
page are indeed clear on what each classifier does. I was just a bit
confused about the practical side: when, specifically, should I go for
one algorithm rather than the other (when the data / number of classes
can scale steeply and the number of prediction requests per hour is
very large, making predict/vectorization time critical)?

For example, http://scikit-learn.org/dev/modules/multiclass.html#one-vs-the-rest
says: "In addition to its computational efficiency (only n_classes
classifiers are needed), one advantage of this approach is its
interpretability." And
http://scikit-learn.org/dev/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression
says: "In the multiclass case, the training algorithm uses a
one-vs.-all (OvA) scheme, rather than the “true” multinomial LR." So if
OneVsRestClassifier is more computationally efficient and the
underlying method/algorithm can achieve the same rate of success, why
not use that scheme directly? (See the first sketch at the end of this
mail for the two variants side by side.)

> >> 2) Isn't SGDClassifier(loss='log') better than LogisticRegression
> >> for large sparse datasets? If so, why?
> >
> > It's faster to train *once* you chose the learning rate, which is
> > usually a pain. You can also try LogisticRegression(tol=1e-2) or
> > LogisticRegression(tol=1e-1).
>
> Actually the default learning rate schedule of scikit-learn kind of
> always works, but you have to adjust `n_iter`, which is an additional
> parameter w.r.t. LogisticRegression.
>
> Also SGDClassifier can spare a dataset memory copy if your data can
> be natively loaded as a scipy Compressed Sparse Row (CSR) matrix.
> Also, if the data does not fit in memory, you can load it as CSR
> chunks (e.g. from a set of svmlight files on the filesystem or a
> database, or vectorized on the fly from text content using a
> pre-fitted text vectorizer) and the model can be learned
> incrementally using sequential calls to the partial_fit method.

Later on I might need to split the large dataset, so partial_fit might
be a good option in case of memory issues (see the second sketch at
the end of this mail). Presently I am fine on memory since I use
generators to feed the data; the max memory usage for the entire
training run turns out to be around 10-15 GB. Thanks for clearing that
up, though.

Er, since I am rather new to scikit-learn and machine learning in
general, I apologize for the simple questions.
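To make the one-vs-rest question above concrete, here is a minimal
sketch (the iris data and variable names are just stand-ins for
illustration, not from this thread): on a multiclass problem, plain
LogisticRegression already fits one binary classifier per class
internally, so explicitly wrapping it in OneVsRestClassifier trains
essentially the same kind of model:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Built-in multiclass handling: one-vs-rest under the hood.
clf_builtin = LogisticRegression().fit(X, y)
print(clf_builtin.coef_.shape)        # one row of coefficients per class

# The same scheme made explicit via the wrapper.
clf_wrapped = OneVsRestClassifier(LogisticRegression()).fit(X, y)
print(len(clf_wrapped.estimators_))   # one binary estimator per class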
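And here is a rough sketch of the out-of-core scheme described in the
quoted reply above (the chunk file layout, the n_features value and
the class list are all made up for the example): each chunk is loaded
as a CSR matrix and fed to partial_fit, so the full dataset never has
to be in memory at once:

import glob
from sklearn.datasets import load_svmlight_file
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(loss='log')
all_classes = [0, 1, 2]  # hypothetical: the full label set must be known up front

for path in sorted(glob.glob('chunks/part-*.svmlight')):  # hypothetical layout
    # Fixing n_features keeps every chunk's CSR matrix the same width.
    X_chunk, y_chunk = load_svmlight_file(path, n_features=2 ** 20)
    # classes= is required on (at least) the first partial_fit call.
    clf.partial_fit(X_chunk, y_chunk, classes=all_classes)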
