Dear List,

Just out of curiosity (disclaimer: so far I have never used random forests for more than a little playing around):

Is no out-of-bag estimate available?
I mean, for any one given sample, a fraction of about 1/e (roughly 37%) of the trees already have that sample out-of-bag, as Andy explained. If the voting is now done only over those oob trees, I should get a classical oob performance measure. Or is the oob estimate used up internally by some kind of optimization (and what would that be, given that the trees are fully grown)?
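
The 1/e figure can be checked with a small simulation. This is only a sketch in plain Python, not code from the randomForest package: it assumes each tree is grown on a bootstrap sample of the same size as the training set, and the helper name `oob_fraction` is made up for illustration.

```python
import math
import random

def oob_fraction(n_samples, n_trees, seed=0):
    """Fraction of trees for which sample 0 is out-of-bag (hypothetical helper)."""
    rng = random.Random(seed)
    oob_count = 0
    for _ in range(n_trees):
        # Each tree draws n_samples indices with replacement (a bootstrap sample).
        bag = {rng.randrange(n_samples) for _ in range(n_samples)}
        if 0 not in bag:
            oob_count += 1
    return oob_count / n_trees

frac = oob_fraction(n_samples=200, n_trees=2000)
print(f"simulated oob fraction: {frac:.3f}, 1/e = {math.exp(-1):.3f}")
```

An oob estimate would then let only those trees vote on the sample, so every prediction comes from trees that never saw it during training.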

Hoping that I do not spoil the list's pedagogic efforts to teach Ravishankar to do his homework reasoning himself...

Claudia

On 23.10.2010 20:49, Changbin Du wrote:
I think you should use 10-fold cross-validation and judge your performance on
the validation parts. What you did is overfitted for sure, since you test on
the same training set used for your model building.
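
A minimal sketch of the 10-fold scheme suggested above, in plain Python. The names `k_fold_indices`, `cross_validate`, `fit_majority` and `predict_majority` are hypothetical, and the majority-class learner is only a stand-in for whatever model is being assessed.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle 0..n_samples-1 and deal the indices into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(xs, ys, fit, predict, k=10):
    """Average held-out accuracy over k folds.

    `fit(xs, ys)` returns a model; `predict(model, x)` returns a label.
    Both are placeholders for the learner under assessment.
    """
    scores = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        # Score only on samples the model did not see while fitting.
        hits = sum(predict(model, xs[i]) == ys[i] for i in fold)
        scores.append(hits / len(fold))
    return sum(scores) / len(scores)

# Toy learner (assumption for illustration): always predict the majority class.
def fit_majority(xs, ys):
    return max(set(ys), key=ys.count)

def predict_majority(model, x):
    return model

xs = list(range(50))
ys = [1] * 35 + [0] * 15
cv_acc = cross_validate(xs, ys, fit_majority, predict_majority)
print(f"10-fold CV accuracy: {cv_acc:.2f}")
```

Each sample is held out exactly once, so the averaged score never rewards a model for memorising the data it is tested on.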


On Sat, Oct 23, 2010 at 6:39 AM, mxkuhn<mxk...@gmail.com>  wrote:

I think the issue is that you really can't use the training set to judge
this (without resampling).

For example, k-nearest neighbors is not known to overfit, but a 1-NN
model will always perfectly predict the training data.
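
The 1-NN point can be illustrated with a small sketch in plain Python. It assumes random 2-D points with pure-noise labels (data invented for this example, not from the thread), so there is nothing to learn, yet the training accuracy is still perfect because every training point is its own nearest neighbor.

```python
import random

def predict_1nn(train_x, train_y, query):
    """Return the label of the training point closest to `query`."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)))
    return train_y[best]

rng = random.Random(1)
# Labels are pure noise, so there is nothing to learn.
train_x = [(rng.random(), rng.random()) for _ in range(100)]
train_y = [rng.randrange(2) for _ in range(100)]
test_x = [(rng.random(), rng.random()) for _ in range(1000)]
test_y = [rng.randrange(2) for _ in range(1000)]

def accuracy(xs, ys):
    """Share of points in (xs, ys) whose 1-NN prediction matches the label."""
    hits = sum(predict_1nn(train_x, train_y, x) == y for x, y in zip(xs, ys))
    return hits / len(xs)

train_acc = accuracy(train_x, train_y)
test_acc = accuracy(test_x, test_y)
print(f"training accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The training score says nothing about generalization here; only the held-out score, which should hover around chance, is informative.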

Max

On Oct 23, 2010, at 9:05 AM, "Liaw, Andy"<andy_l...@merck.com>  wrote:

What Breiman meant is that as the model gets more complex (i.e., as the
number of trees tends to infinity) the generalization error (test set
error) does not increase.  This does not hold for boosting, for example;
i.e., you can't "boost forever", which makes it necessary to find the
optimal number of iterations.  You don't need that with RF.

-----Original Message-----
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of vioravis
Sent: Saturday, October 23, 2010 12:15 AM
To: r-help@r-project.org
Subject: Re: [R] Random Forest AUC


Thanks Max and Andy. If the Random Forest is always giving an AUC of 1,
isn't it overfitting? If not, how do you differentiate this from
overfitting? I believe random forests are claimed to never overfit (from
the following link).

http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#features


Ravishankar R
--
View this message in context:
http://r.789695.n4.nabble.com/Random-Forest-AUC-tp3006649p3008157.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
