Hi Olivier,

The higher the number of estimators, the better. The more random the
trees (e.g., the lower max_features), the more important it usually is
to have a large forest to decrease the variance. To me, 10 is actually
a very low default value. In my daily research, I deal with hundreds
of trees. But yeah, it also takes longer.
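
A minimal sketch of that effect, in case it is useful. It assumes a recent scikit-learn (where train_test_split lives in sklearn.model_selection) and uses a synthetic dataset from make_classification, not the faces data. The helper name accuracy_spread and all parameter values are made up for illustration. It refits very random extra trees (max_features=1) with different seeds and reports how much the test accuracy moves from one fit to the next:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption, not the dataset from this thread).
X, y = make_classification(n_samples=600, n_features=40, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)


def accuracy_spread(n_estimators, max_features, n_repeats=10):
    """Return (mean, std) of test accuracy over forests built with different seeds."""
    scores = []
    for seed in range(n_repeats):
        clf = ExtraTreesClassifier(n_estimators=n_estimators,
                                   max_features=max_features,
                                   random_state=seed)
        clf.fit(X_train, y_train)
        scores.append(clf.score(X_test, y_test))
    return float(np.mean(scores)), float(np.std(scores))


# max_features=1 makes each tree as random as possible.
for n_estimators in (10, 100):
    mean, std = accuracy_spread(n_estimators, max_features=1)
    print("n_estimators=%3d  mean accuracy=%.3f  std across seeds=%.3f"
          % (n_estimators, mean, std))
```

One would expect the spread across seeds to shrink with the larger forest, though the exact numbers depend on the data and on the scikit-learn version.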

By the way, I am curious: what kind of dataset are you testing those
methods on? :)

Gilles

On 25 March 2012 03:49, Olivier Grisel <[email protected]> wrote:
> Hi all,
>
> I have been playing a bit with GradientBoostingClassifier and
> AdaBoostClassifier and ExtraTrees and while extra trees and adaboost
> are reasonably fast to fit with their default params (n_estimators=10)
> on a non-toy dataset such as the Olivetti faces dataset, the
> GradientBoostingClassifier was taking ages (I killed it).
>
> The current default value is n_estimators=100 for
> GradientBoostingClassifier. Maybe it should be aligned to
> n_estimators=10 as in the other ensemble methods of the scikit?
>
> Or was I doing something very stupid by naively running it with the
> default params on a dataset with size n_samples=400, n_features=4096
> and n_classes=40 without any kind of preprocessing?
>
> Another way to rephrase that question: what is the typical sweet spot
> for the dataset shape when doing classification with Gradient Boosted
> Trees? What are reasonable values for the number of estimators in
> various application domains?
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
> ------------------------------------------------------------------------------
> This SF email is sponsosred by:
> Try Windows Azure free for 90 days Click Here
> http://p.sf.net/sfu/sfd2d-msazure
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

