Dylan Beaudette wrote:
Hi,
I was hoping to clarify the exact behavior associated with this incantation:
validate(fit.ols, method='cross', B=50)
Output:
          index.orig training    test optimism index.corrected  n
R-square      0.5612   0.5613  0.5171   0.0442          0.5170 50
MSE           1.3090   1.3086  1.3547  -0.0462          1.3552 50
Intercept     0.0000   0.0000 -0.0040   0.0040         -0.0040 50
Slope         1.0000   1.0000  0.9899   0.0101          0.9899 50
Questions:
1. Does this perform 50 replicate, 10-fold CV operations?
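No. Type ?validate. With method='cross', B is the number of groups of
omitted observations, so this is a single 50-fold cross-validation: you
are leaving out 1/50th of the rows of the data each time the model is fit.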
If your sample size is not huge, you may need to average multiple runs
of cross-validation to get adequate precision. The bootstrap is more
efficient and a bit easier to do.
Note that if fit.ols was not a fully pre-specified model (e.g., if you
did any variable selection) you are not using validate correctly and are
getting biased estimates.
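A minimal sketch of both approaches, assuming the rms package (the successor to Design) is installed. The simulated data, the variable names, and the choice of 20 repeats and 200 bootstrap resamples are illustrative assumptions, not taken from the original post.

```r
library(rms)  # assumed installed; provides ols() and validate()

set.seed(1)
n <- 500
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))   # simulated, illustrative data
d$y <- 1 + 0.8 * d$x1 - 0.5 * d$x2 + rnorm(n)

# x = TRUE, y = TRUE store the design matrix and response so validate() can refit
fit <- ols(y ~ x1 + x2, data = d, x = TRUE, y = TRUE)

# One run of 10-fold cross-validation (B = number of omitted groups)
v_cv <- validate(fit, method = "crossvalidation", B = 10)

# Average the corrected R^2 over 20 repeated 10-fold runs for better precision
r2_runs <- replicate(20,
  validate(fit, method = "crossvalidation", B = 10)["R-square", "index.corrected"])
r2_cv <- mean(r2_runs)

# Bootstrap optimism correction with 200 resamples
v_boot  <- validate(fit, method = "boot", B = 200)
r2_boot <- v_boot["R-square", "index.corrected"]
```

The spread of r2_runs gives a rough idea of how much a single cross-validation run can vary for a given sample size.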
2. What do the slope and intercept terms refer to?
The slope is the estimated slope from regressing Ynew on Xnew*BETAold,
i.e. the slope of the calibration (reliability) curve. Likewise for the
intercept.
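The calibration slope and intercept can be sketched by hand in base R, using a single train/test split rather than the resampling that validate performs. The simulated data and all object names here are hypothetical.

```r
set.seed(2)
n <- 300
d <- data.frame(x = rnorm(n))            # simulated, illustrative data
d$y <- 2 + 1.5 * d$x + rnorm(n)

train <- d[1:200, ]
test  <- d[201:300, ]

fit_old <- lm(y ~ x, data = train)            # gives BETAold
lp_new  <- predict(fit_old, newdata = test)   # Xnew * BETAold

# Calibration regression: Ynew on the old model's predictions
calib           <- coef(lm(test$y ~ lp_new))
calib_intercept <- unname(calib[1])
calib_slope     <- unname(calib[2])

# On the training data the same regression returns intercept 0 and slope 1
calib_train <- unname(coef(lm(train$y ~ fitted(fit_old))))
```

This is why the index.orig and training columns show an intercept of 0 and a slope of 1 for a least squares fit: the departure from those values only appears on data not used in fitting.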
3. How can I interpret the 'test R2'?
It is a nearly unbiased estimate of R^2 to assess the likely future
performance of the model on new data from the same data stream.
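The columns of the output above appear to be related by simple arithmetic, which can be checked directly from the printed values: optimism is training minus test, and index.corrected is index.orig minus optimism. The sketch below only re-derives the numbers from the posted table; it does not call validate.

```r
# Values copied from the posted validate() output
tab <- rbind(
  "R-square" = c(index.orig = 0.5612, training = 0.5613, test =  0.5171),
  "MSE"      = c(index.orig = 1.3090, training = 1.3086, test =  1.3547),
  "Intercept"= c(index.orig = 0.0000, training = 0.0000, test = -0.0040),
  "Slope"    = c(index.orig = 1.0000, training = 1.0000, test =  0.9899)
)

optimism  <- tab[, "training"] - tab[, "test"]
corrected <- tab[, "index.orig"] - optimism

# Compare with the index.corrected column as printed
printed <- c(0.5170, 1.3552, -0.0040, 0.9899)
max_diff <- max(abs(corrected - printed))
```

The results agree with the printed index.corrected column to within rounding of the four-decimal display.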
Frank
Thanks in advance!
Cheers,
Dylan
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University