Olivier,

In my experience GBRT usually requires more base learners than random
forests to reach the same level of accuracy - I hardly ever use fewer
than 100. Regarding the poor performance of GBRT on the olivetti
dataset: multi-class GBRT fits ``k`` trees at each stage, so with
``n_estimators`` stages you end up growing ``k * n_estimators`` trees
in total (4000 trees is quite a lot :-) ). Personally, I haven't used
multi-class GBRT much (part of the reason is that GBM does not support
it) - I know that the learning-to-rank folks use multi-class GBRT for
ordinal-scaled output values (e.g. "not relevant", "relevant", "highly
relevant"), but these usually involve fewer than 5 classes.
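To make the tree count concrete, here is a minimal sketch (assuming the
``estimators_`` attribute of ``GradientBoostingClassifier`` keeps its
``(n_estimators, n_classes)`` shape for multi-class problems):

    # Count the trees grown by a multi-class GradientBoostingClassifier:
    # one regression tree per class per boosting stage.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier

    X, y = make_classification(n_samples=200, n_features=20,
                               n_informative=10, n_classes=5,
                               random_state=0)
    clf = GradientBoostingClassifier(n_estimators=100).fit(X, y)
    print(clf.estimators_.shape)  # (100, 5) -> 500 trees in total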
That said, the major drawback of GBRT is computational complexity: we
could speed up multi-class GBRT by training the ``k`` trees at each
stage in parallel, but it would still be much less efficient than
random forests. Maybe Scott (on CC) can comment on this as well - he
has worked on the multi-class support in GBRT and knows much more
about it.

@Olivier: it would be great if you could send me your benchmark script
so that I can look into the issue in more detail.

thanks,
 Peter

2012/3/25 Gilles Louppe <[email protected]>:
> Hi Olivier,
>
> The higher the number of estimators, the better. The more random the
> trees (e.g., the lower max_features), the more important it usually is
> to have a large forest to decrease the variance. To me, 10 is actually
> a very low default value. In my daily research, I deal with hundreds
> of trees. But yeah, it also takes longer.
>
> By the way, I am curious: what kind of dataset are you testing those
> methods on? :)
>
> Gilles
>
> On 25 March 2012 03:49, Olivier Grisel <[email protected]> wrote:
>> Hi all,
>>
>> I have been playing a bit with GradientBoostingClassifier,
>> AdaBoostClassifier and ExtraTrees. While extra trees and adaboost are
>> reasonably fast to fit with their default params (n_estimators=10) on
>> a non-toy dataset such as the olivetti faces dataset, the
>> GradientBoostingClassifier was taking ages (I killed it).
>>
>> The current default value is n_estimators=100 for
>> GradientBoostingClassifier. Maybe it should be aligned to
>> n_estimators=10 as in the other ensemble methods of the scikit?
>>
>> Or was I doing something very stupid by naively running it with the
>> default params on a dataset with n_samples=400, n_features=4096 and
>> n_classes=40 without any kind of preprocessing?
>>
>> Another way to rephrase the question: what is the typical sweet spot
>> for the dataset shape when doing classification with Gradient Boosted
>> Trees? What are reasonable values for the number of estimators in
>> various application domains?
>>
>> --
>> Olivier
>> http://twitter.com/ogrisel - http://github.com/ogrisel

--
Peter Prettenhofer
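P.S. In case it helps, a rough sketch of the kind of benchmark being
discussed (not Olivier's actual script; it simply times a couple of
ensemble classifiers on the olivetti faces with their default
n_estimators=10, assuming ``fetch_olivetti_faces`` is available):

    # Rough timing sketch: fit ensemble classifiers on the Olivetti faces
    # dataset (n_samples=400, n_features=4096, n_classes=40).
    from time import time
    from sklearn.datasets import fetch_olivetti_faces
    from sklearn.ensemble import ExtraTreesClassifier, \
        GradientBoostingClassifier

    faces = fetch_olivetti_faces()
    X, y = faces.data, faces.target

    for clf in (ExtraTreesClassifier(n_estimators=10),
                GradientBoostingClassifier(n_estimators=10)):
        t0 = time()
        clf.fit(X, y)
        print("%s: %.1fs" % (clf.__class__.__name__, time() - t0))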
