Hello,
I have been reading and testing examples from the sklearn documentation, but
I am not clear on a few things and would appreciate any help with the
following questions:
1) What is the advantage of training LogisticRegression vs.
OneVsRestClassifier(LogisticRegression()) for multiclass problems? (I understand
the latter would basically train n_classes classifiers.)
2) Isn't SGDClassifier(loss='log') better than LogisticRegression for large
sparse datasets? If so, why?
3) If I need predict_proba for just the best class match from the multiclass
classifier, can I use OneVsRestClassifier(SGDClassifier())?
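
To make questions 1) and 3) concrete, here is a minimal pure-Python sketch of
the arithmetic as I understand it, not sklearn's actual code. It assumes that
a one-vs-rest setup squashes each class score independently and then
renormalizes, while a multinomial model applies one softmax over all scores.
The function names and the example scores are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function for a single decision score."""
    return 1.0 / (1.0 + math.exp(-z))


def ovr_proba(scores):
    """One-vs-rest: per-class sigmoid, then renormalize so the row sums to 1."""
    raw = [sigmoid(s) for s in scores]
    total = sum(raw)
    return [p / total for p in raw]


def softmax_proba(scores):
    """Multinomial: a single softmax over all class scores."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def best_class(proba):
    """Return (index, probability) of the most probable class."""
    idx = max(range(len(proba)), key=proba.__getitem__)
    return idx, proba[idx]


# Hypothetical decision scores for one sample and three classes.
scores = [2.0, -1.0, 0.5]
ovr = ovr_proba(scores)
soft = softmax_proba(scores)
print("one-vs-rest:", best_class(ovr))
print("softmax:    ", best_class(soft))
```

Under these assumptions both variants pick the same best class, but the
probability attached to it differs, which matters if that value is thresholded
or compared across models.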

I tested on empirical data and got approximately the same results with
LogisticRegression, SGDClassifier and LinearSVC. (For now, data.shape = (10000,
400000).) However, in the future the data might grow to a much larger number of
training samples and features, so I wanted to get a clearer idea of which
approach to choose.
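
A minimal sketch of this kind of comparison follows. It runs on small synthetic
sparse data, not the actual dataset. The data shape, the random "true" weights
and the model settings are all assumptions chosen so it runs quickly. Note that
newer scikit-learn releases spell the logistic loss 'log_loss' rather than
'log', so the sketch picks the name from the installed version.

```python
import re

import numpy as np
import sklearn
from scipy import sparse
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Synthetic stand-in data (assumption): much smaller than (10000, 400000).
rng = np.random.RandomState(0)
X = sparse.random(600, 2000, density=0.01, format="csr", random_state=rng)
W = rng.randn(2000, 3)                 # hypothetical "true" weights, 3 classes
y = np.asarray(X @ W).argmax(axis=1)   # label = class with the highest score

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The logistic loss is named "log_loss" from scikit-learn 1.1 on, "log" before.
major, minor = map(int, re.match(r"(\d+)\.(\d+)", sklearn.__version__).groups())
log_loss_name = "log_loss" if (major, minor) >= (1, 1) else "log"

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "OvR(LogisticRegression)": OneVsRestClassifier(
        LogisticRegression(max_iter=1000)
    ),
    "SGDClassifier(log loss)": SGDClassifier(loss=log_loss_name, random_state=0),
    "LinearSVC": LinearSVC(),
}

# Fit each model on the same split and record held-out accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {scores[name]:.3f}")

# Probability of the best class only (LinearSVC has no predict_proba).
proba = models["SGDClassifier(log loss)"].predict_proba(X_test)
best_proba = proba.max(axis=1)
print("first five best-class probabilities:", best_proba[:5])
```

The accuracies on this toy data say nothing about the real problem. The point
is only that all four estimators accept the same sparse input, so they can be
swapped in and timed on the real data as it grows.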
Thanks,
A


_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
