Hi all,

I have been playing a bit with GradientBoostingClassifier,
AdaBoostClassifier and ExtraTrees, and while extra trees and adaboost
are reasonably fast to fit with their default params (n_estimators=10)
on a non-toy dataset such as the olivetti faces dataset,
GradientBoostingClassifier was taking ages (I killed it).
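
For reference, here is a minimal sketch of what I ran (the actual
script was a bit messier; the timing loop is just to show the setup):

from time import time

from sklearn.datasets import fetch_olivetti_faces
from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier)

faces = fetch_olivetti_faces()
X, y = faces.data, faces.target  # n_samples=400, n_features=4096, 40 classes

# Extra-trees and AdaBoost finish quickly with their default params:
for klass in (ExtraTreesClassifier, AdaBoostClassifier):
    clf = klass()
    t0 = time()
    clf.fit(X, y)
    print("%s fit in %.1fs" % (klass.__name__, time() - t0))

# GradientBoostingClassifier defaults to n_estimators=100, and with
# multi-class deviance each stage fits one regression tree per class,
# i.e. 100 * 40 = 4000 trees on 4096 features -- this is the fit that
# never finished for me:
# GradientBoostingClassifier().fit(X, y)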

The current default value is n_estimators=100 for
GradientBoostingClassifier. Maybe it should be aligned with the
n_estimators=10 used by the other ensemble methods of the scikit?

Or was I doing something very stupid by naively running it with the
default params on a dataset of size n_samples=400, n_features=4096
and n_classes=40 without any kind of preprocessing?
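
If the answer is "reduce the dimensionality first", here is the kind
of workaround I have in mind; n_components=50 and n_estimators=10 are
arbitrary guesses on my part:

from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import RandomizedPCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline

faces = fetch_olivetti_faces()
X, y = faces.data, faces.target

# Project the 4096 raw pixels down to 50 components before boosting,
# and use a much smaller ensemble than the n_estimators=100 default.
clf = Pipeline([
    ("pca", RandomizedPCA(n_components=50)),
    ("gbrt", GradientBoostingClassifier(n_estimators=10)),
])
clf.fit(X, y)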

Another way to rephrase that question: what is the typical sweet spot
for the dataset shape when doing classification with Gradient Boosted
Trees? What are reasonable values for the number of estimators in
various application domains?

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
