> 
> Indeed Abhi, which specific section of the documentation (or
> docstring) led you to ask this question?
> 
> The note on this page is pretty explicit:
> 
> http://scikit-learn.org/dev/modules/multiclass.html
> 
> Along with the docstring:
> 
> http://scikit-learn.org/dev/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression
> 
> Maybe the docstring could be made more consistent and use the
> one-vs-rest notation instead of one-vs-all (which is a synonym).

Ah, sorry, I should have been clearer: the docstring and the multiclass
page are indeed clear on what each classifier does. I was just a bit
confused about the practical application, specifically when I should pick
one algorithm over the other (when the data/number of classes can scale
steeply and the number of requests for predicting the correct category per
hour is very large, so predict/vectorization time is critical).

For example,
http://scikit-learn.org/dev/modules/multiclass.html#one-vs-the-rest
says:

"In addition to its computational efficiency (only n_classes classifiers
are needed), one advantage of this approach is its interpretability."

and
http://scikit-learn.org/dev/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression
says:

"In the multiclass case, the training algorithm uses a one-vs.-all (OvA)
scheme, rather than the “true” multinomial LR."

So if OneVsRestClassifier is more computationally efficient and the
underlying method/algorithm can achieve the same rate of success, why not
directly use that scheme?
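
To make the question concrete, here is roughly what I was comparing (just
a sketch on a made-up toy dataset, not a benchmark):

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.multiclass import OneVsRestClassifier

    X, y = make_classification(n_samples=1000, n_features=20,
                               n_informative=10, n_classes=5)

    # Built-in scheme: liblinear fits one binary problem per class
    # internally.
    clf_builtin = LogisticRegression().fit(X, y)

    # Explicit meta-estimator: the same one-vs-rest scheme, but each
    # binary classifier is a separate estimator object.
    clf_meta = OneVsRestClassifier(LogisticRegression()).fit(X, y)

    print(clf_builtin.score(X, y))
    print(clf_meta.score(X, y))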

> 
> >> 2) Isn't SGDClassifier(loss='log') better than LogisticRegression for
> >> large sparse datasets? If so, why?
> >
> > It's faster to train *once* you choose the learning rate, which is
> > usually a pain. You can also try LogisticRegression(tol=1e-2) or
> > LogisticRegression(tol=1e-1).
> 
> Actually the default learning rate schedule of scikit-learn kind of
> always works, but you have to adjust `n_iter`, which is an additional
> parameter w.r.t. LogisticRegression.
> 
> Also, SGDClassifier can spare a dataset memory copy if your data can be
> natively loaded as a scipy Compressed Sparse Row (CSR) matrix. If the
> data does not fit in memory you can load it as CSR chunks (e.g. from a
> set of svmlight files on the filesystem or a database, or vectorized on
> the fly from text content using a pre-fitted text vectorizer) and the
> model can be learned incrementally using sequential calls to the
> partial_fit method.
> 
Later on I might need to split the large dataset, so partial_fit might be
a good option in case of memory issues. At present I am fine on memory:
since I use generators to feed the data, the maximum memory usage for the
entire training run is around 10-15 GB. Thanks for clearing that up,
though.
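
For reference, this is roughly how I picture the incremental setup you
describe (just a sketch; the chunk file names, feature count and number of
classes below are made up for illustration):

    import numpy as np
    from sklearn.datasets import load_svmlight_file
    from sklearn.linear_model import SGDClassifier

    CHUNK_FILES = ['chunk_0.svm', 'chunk_1.svm']  # hypothetical paths
    N_FEATURES = 2 ** 20  # fixed feature space, e.g. a hashing vectorizer
    ALL_CLASSES = np.arange(50)  # partial_fit needs every label up front

    clf = SGDClassifier(loss='log')
    for path in CHUNK_FILES:
        # each chunk loads as a scipy CSR matrix, so no densifying copy
        X_chunk, y_chunk = load_svmlight_file(path, n_features=N_FEATURES)
        clf.partial_fit(X_chunk, y_chunk, classes=ALL_CLASSES)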

Er, since I am rather new to scikit-learn and machine learning in general,
I apologize for the simple questions.


