Thanks to you all!
Now I got it!
--
View this message in context:
http://r.789695.n4.nabble.com/Random-Forest-Cross-Validation-tp3314777p3327384.html
Sent from the R help mailing list archive at Nabble.com.
steps
such as feature selection, all bets are off.
Andy
-----Original Message-----
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of mxkuhn
Sent: Tuesday, February 22, 2011 7:17 PM
To: ronzhao
Cc: r-help@r-project.org
Subject: Re: [R] Random Forest Cross Validation
Thanks, Max.

Yes, I did some feature selection in the training set. Basically, I
selected the top 1000 SNPs based on OOB error and grew the forest using the
training set, then used the test set to validate the forest. But if I did
the same thing in the test set, the top SNPs would be different.
If you want to get honest estimates of accuracy, you should repeat the feature
selection within the resampling (not on the test set). You will get a different
list each time, but that's the point. Right now you are not capturing that
uncertainty, which is why the OOB and test set results differ so much.
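Max's suggestion above (redo the SNP selection inside every resampling fold, not once up front) could be sketched as below. Everything here is illustrative: the simulated data, the top-k cutoff, and the use of variable importance as the ranking criterion are stand-ins, not details from the thread.

```r
## Honest cross-validation with feature selection repeated inside each fold.
## A minimal sketch, assuming a genotype matrix X and case/control factor y.
library(randomForest)

set.seed(1)
n <- 200; p <- 500                       # small simulated stand-in for GWAS data
X <- matrix(sample(0:2, n * p, replace = TRUE), n, p,
            dimnames = list(NULL, paste0("snp", 1:p)))
y <- factor(sample(c("case", "control"), n, replace = TRUE))

k     <- 50                              # SNPs kept per fold (illustrative)
folds <- sample(rep(1:5, length.out = n))
err   <- numeric(5)

for (f in 1:5) {
  tr <- folds != f                       # training part of this fold only
  ## Rank SNPs using ONLY the training part, then keep the top k
  rf.rank <- randomForest(X[tr, ], y[tr], ntree = 100, importance = TRUE)
  top     <- order(importance(rf.rank)[, "MeanDecreaseGini"],
                   decreasing = TRUE)[1:k]
  ## Refit on the selected SNPs and assess on the held-out part
  rf.fit <- randomForest(X[tr, top], y[tr], ntree = 100)
  err[f] <- mean(predict(rf.fit, X[!tr, top]) != y[!tr])
}
mean(err)                                # CV error that includes selection uncertainty
```

The selected SNP lists will indeed differ across the five folds; averaging the fold errors is what folds that selection uncertainty into the accuracy estimate.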
Hi,

I am using the randomForest package to do some prediction work on GWAS data. I
first split the data into training and testing sets (70% vs 30%), then
used the training set to grow the trees (ntree=10). It looks like the OOB
error in the training set is good (10%). However, it is not very good for
the test set.
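The workflow described in the question could be sketched as follows. The data and variable names are made up for illustration; note also that the randomForest default is ntree = 500, so ntree = 10 gives a very noisy forest and an unstable OOB estimate.

```r
## Minimal sketch of the 70/30 split-then-validate workflow described above.
## X and y are illustrative stand-ins for the GWAS genotypes and phenotype.
library(randomForest)

set.seed(1)
n <- 100; p <- 20
X <- data.frame(matrix(rnorm(n * p), n, p))
y <- factor(sample(c("case", "control"), n, replace = TRUE))

idx <- sample(n, size = round(0.7 * n))          # 70% training split
rf  <- randomForest(X[idx, ], y[idx], ntree = 500)

oob  <- rf$err.rate[rf$ntree, "OOB"]             # OOB error on the training set
test <- mean(predict(rf, X[-idx, ]) != y[-idx])  # held-out test-set error
c(oob = oob, test = test)
```

As discussed above, if any data-dependent step (such as SNP selection) happens before the split or outside the resampling, the OOB error no longer behaves like cross-validation and the two numbers can diverge badly.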