Hi Olivier,

The higher the number of estimators, the better. The more random the trees (e.g., the lower max_features), the more important it usually is to have a large forest to decrease the variance. To me, 10 is actually a very low default value. In my daily research, I deal with hundreds of trees. But yeah, it also takes longer.
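To make the variance point concrete, here is a minimal sketch, not a benchmark. It assumes a recent scikit-learn (where `sklearn.model_selection` exists) and uses a small synthetic dataset from `make_classification` as a stand-in, not the Olivetti faces. The helper `accuracy_spread` is a hypothetical name introduced for illustration: it refits the same `ExtraTreesClassifier` configuration with several seeds and reports how much the test accuracy moves from one forest to the next.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption for this sketch, not a real dataset).
X, y = make_classification(
    n_samples=400, n_features=50, n_informative=10,
    n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0
)


def accuracy_spread(n_estimators, max_features, seeds=range(5)):
    """Return (mean, std) of test accuracy over forests built with different seeds."""
    scores = []
    for seed in seeds:
        forest = ExtraTreesClassifier(
            n_estimators=n_estimators,
            max_features=max_features,
            random_state=seed,
        )
        forest.fit(X_train, y_train)
        scores.append(forest.score(X_test, y_test))
    return float(np.mean(scores)), float(np.std(scores))


# max_features=1 gives the most random trees; "sqrt" is less random.
results = {}
for max_features in (1, "sqrt"):
    for n_estimators in (10, 100):
        mean, std = accuracy_spread(n_estimators, max_features)
        results[(max_features, n_estimators)] = (mean, std)
        print(f"max_features={max_features!r:7} n_estimators={n_estimators:3d} "
              f"accuracy={mean:.3f} +/- {std:.3f}")
```

The standard deviation across seeds is a rough proxy for the variance that comes from the randomization itself. The exact numbers depend on the data and on the scikit-learn version, so treat the output as indicative only.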
By the way, I am curious: what kind of dataset are you testing those methods on? :)

Gilles

On 25 March 2012 03:49, Olivier Grisel <[email protected]> wrote:
> Hi all,
>
> I have been playing a bit with GradientBoostingClassifier,
> AdaBoostClassifier and ExtraTrees. While extra trees and AdaBoost
> are reasonably fast to fit with their default params (n_estimators=10)
> on a non-toy dataset such as the Olivetti faces dataset, the
> GradientBoostingClassifier was taking ages (I killed it).
>
> The current default value is n_estimators=100 for
> GradientBoostingClassifier. Maybe it should be aligned to
> n_estimators=10 as in the other ensemble methods of the scikit?
>
> Or was I doing something very stupid by naively running it with the
> default params on a dataset with size n_samples=400, n_features=4096
> and n_classes=40 without any kind of preprocessing?
>
> Another way to rephrase that question: what is the typical sweet spot
> for the dataset shape when doing classification with Gradient Boosted
> Trees? What are reasonable values for the number of estimators in
> various application domains?
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
