Re: [R] Predictor Importance in Random Forests and bootstrap

2014-01-28 Thread Dimitri Liakhovitski
Thank you, Bert. I'll definitely ask there. In the meantime I just wanted to ensure that my R code (my function for bootstrap and the bootstrap run) is correct and my abnormal bootstrap results are not a function of my erroneous code. Thank you!

Re: [R] Predictor Importance in Random Forests and bootstrap

2014-01-28 Thread Dimitri Liakhovitski
Here is a great response I got from SO: There is an important difference between the two importance measures: MeanDecreaseAccuracy is calculated using out-of-bag (OOB) data; MeanDecreaseGini is not. For each tree MeanDecreaseAccuracy is calculated on observations not used to form that particular
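For readers following along, here is a minimal sketch of how the two measures can be pulled from a fitted forest. It assumes the randomForest package is installed; the toy data frame `d` and its columns `x1`, `x2`, `x3` are hypothetical and are not the data from the original post. `ntree = 1500` simply follows the suggestion made elsewhere in this thread.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(1)
n <- 200

# Hypothetical toy data: three factor predictors, only x1 is related to y
d <- data.frame(
  x1 = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  x2 = factor(sample(c("u", "v"), n, replace = TRUE)),
  x3 = factor(sample(c("p", "q", "r"), n, replace = TRUE))
)

# y depends on x1, with 20 labels flipped as noise
y <- ifelse(d$x1 == "a", "yes", "no")
flip <- sample(n, 20)
y[flip] <- ifelse(y[flip] == "yes", "no", "yes")
d$y <- factor(y)

# importance = TRUE is needed to get the permutation-based measure
rf <- randomForest(y ~ ., data = d, ntree = 1500, importance = TRUE)

# Full importance matrix: per-class columns plus the two overall measures
imp <- importance(rf)
print(imp[, c("MeanDecreaseAccuracy", "MeanDecreaseGini")])

# The same measures requested individually
acc  <- importance(rf, type = 1)  # permutation measure, computed on OOB data
gini <- importance(rf, type = 2)  # node impurity decrease, from the data used to grow the trees
```

Because the two columns come from different data (OOB versus the data used to grow each tree), there is no guarantee that they rank predictors the same way.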

Re: [R] Predictor Importance in Random Forests and bootstrap

2014-01-28 Thread Max Kuhn
I think that the fundamental problem is that you are using the default value of ntree (500). You should always use at least 1500 and more if n or p are large. Also, this link will give you more up-to-date information on that package and feature selection:

[R] Predictor Importance in Random Forests and bootstrap

2014-01-27 Thread Dimitri Liakhovitski
Hello! Below, I: 1. Create a data set with a bunch of factors. All of them are predictors and 'y' is the dependent variable. 2. Run a classification Random Forests model with predictor importance and look at 2 measures of importance: MeanDecreaseAccuracy and MeanDecreaseGini. 3. Run 2 bootstrap
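The original code is not reproduced in this listing, so what follows is only a rough sketch of the kind of procedure described above, not the poster's function. It assumes the randomForest package is installed; the toy data, the helper name `boot_importance`, and the choice of `B = 20` resamples are all hypothetical. Resampling is done with base `sample()` rather than any particular bootstrap package.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(2)
n <- 150

# Hypothetical toy data: two factor predictors and a binary outcome driven by x1
d <- data.frame(
  x1 = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  x2 = factor(sample(c("u", "v"), n, replace = TRUE))
)
d$y <- factor(ifelse(d$x1 == "a" | runif(n) < 0.1, "yes", "no"))

# Hypothetical helper: refit the forest on one bootstrap resample of the rows
# and return both importance measures as a predictors x measures matrix
boot_importance <- function(data, ntree = 500) {
  idx <- sample(nrow(data), replace = TRUE)
  rf <- randomForest(y ~ ., data = data[idx, ], ntree = ntree, importance = TRUE)
  importance(rf)[, c("MeanDecreaseAccuracy", "MeanDecreaseGini")]
}

B <- 20  # small on purpose so the sketch runs quickly
reps <- replicate(B, boot_importance(d), simplify = "array")  # predictors x measures x B

# Percentile summaries of each measure for each predictor
print(apply(reps, c(1, 2), quantile, probs = c(0.025, 0.5, 0.975)))
```

One thing to keep in mind with this design: a bootstrap resample contains duplicated rows, so a row can be out of bag for a given tree while an identical copy of it is in bag. That may affect the OOB-based measure differently from the Gini-based one.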

Re: [R] Predictor Importance in Random Forests and bootstrap

2014-01-27 Thread Bert Gunter
I **think** this kind of methodological issue might be better at SO (stats.stackexchange.com). It's not really about R programming, which is the main focus of this list. And yes, I know they do intersect. Nevertheless... Cheers, Bert