Dylan Beaudette wrote:
Hi,

I was hoping to clarify the exact behavior associated with this incantation:

validate(fit.ols, method='cross', B=50)

Output:

          index.orig training    test optimism index.corrected  n
R-square      0.5612   0.5613  0.5171   0.0442          0.5170 50
MSE           1.3090   1.3086  1.3547  -0.0462          1.3552 50
Intercept     0.0000   0.0000 -0.0040   0.0040         -0.0040 50
Slope         1.0000   1.0000  0.9899   0.0101          0.9899 50

Questions:
1. Does this perform 50 replicate, 10-fold CV operations?

Type ?validate

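No. With method='cross', B is the number of groups the rows are split into, so this is a single 50-fold cross-validation: you are leaving out 1/50th of the rows of the data each time the model is fit.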

If your sample size is not huge, you may need to average multiple runs of cross-validation to get adequate precision. The bootstrap is more efficient and a bit easier to do.

Note that if fit.ols was not a fully pre-specified model (e.g., if you did any variable selection) you are not using validate correctly and are getting biased estimates.
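To make the "leave out 1/B of the rows" and "average multiple runs" points concrete, here is a minimal base-R sketch. It is not the internals of validate; the simulated data, the helper cv_r2, and the choice of B = 10 with 20 repeats are all assumptions made for illustration only.

```r
# Hypothetical simulated data; not the poster's data set.
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 1 + 0.8 * d$x1 - 0.5 * d$x2 + rnorm(n)

# One run of B-fold cross-validation (hypothetical helper).
# Each fit omits about 1/B of the rows; R^2 is computed on the omitted
# rows and then averaged over the B groups.
cv_r2 <- function(data, B) {
  fold <- sample(rep(seq_len(B), length.out = nrow(data)))
  r2 <- numeric(B)
  for (k in seq_len(B)) {
    train <- data[fold != k, ]
    test  <- data[fold == k, ]
    fit   <- lm(y ~ x1 + x2, data = train)   # fully pre-specified model
    res   <- test$y - predict(fit, newdata = test)
    r2[k] <- 1 - sum(res^2) / sum((test$y - mean(test$y))^2)
  }
  mean(r2)
}

single   <- cv_r2(d, B = 10)                 # one 10-fold run
repeated <- replicate(20, cv_r2(d, B = 10))  # 20 independent 10-fold runs

# The spread across runs shows how noisy a single run is;
# the mean of the runs is the more precise estimate.
c(single = single, mean = mean(repeated), sd = sd(repeated))
```

Each call to cv_r2 draws a new random split, which is why repeated runs differ and why averaging them helps when the sample is not huge.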


2. What do the slope and intercept terms refer to?

The slope is the estimated slope from regressing Ynew on Xnew*BETAold (the predictions for new data using the old coefficients), i.e. the slope of the calibration (reliability) curve. Likewise for the intercept.
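As a rough illustration of that definition, the sketch below fits a model on one half of some simulated data and regresses the new responses on the old model's predictions. The data, the split, and the object names (fit_old, lp_new, calib) are hypothetical; this is only meant to show the idea, not how validate computes it internally.

```r
# Hypothetical simulated data, split into "old" (training) and "new" rows.
set.seed(2)
n <- 300
d <- data.frame(x = rnorm(n))
d$y <- 2 + 1.5 * d$x + rnorm(n)
train <- d[1:150, ]
new   <- d[151:300, ]

fit_old <- lm(y ~ x, data = train)          # gives BETAold
lp_new  <- predict(fit_old, newdata = new)  # Xnew * BETAold

# Calibration intercept and slope: regress Ynew on the old model's predictions.
calib <- lm(new$y ~ lp_new)
coef(calib)

# On the training data itself the intercept is 0 and the slope is 1,
# up to floating-point error, for a least-squares fit with an intercept.
calib_train <- lm(train$y ~ fitted(fit_old))
coef(calib_train)
```

A slope below 1 on new data suggests the predictions are too extreme (overfitting); a slope near 1 with an intercept near 0 suggests good calibration.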


3. How can I interpret the 'test R2'?

It is a nearly unbiased estimate of R^2 to assess the likely future performance of the model on new data from the same data stream.
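The columns in the output quoted above appear to be linked by simple arithmetic: optimism = training - test, and index.corrected = index.orig - optimism. The short check below retypes the printed numbers (the data frame val and the *_check column names are just for this illustration) and confirms both relations to within rounding in the last printed digit.

```r
# Values copied from the validate() output quoted in the question.
val <- data.frame(
  index      = c("R-square", "MSE", "Intercept", "Slope"),
  index.orig = c(0.5612, 1.3090,  0.0000, 1.0000),
  training   = c(0.5613, 1.3086,  0.0000, 1.0000),
  test       = c(0.5171, 1.3547, -0.0040, 0.9899),
  optimism   = c(0.0442, -0.0462, 0.0040, 0.0101),
  corrected  = c(0.5170, 1.3552, -0.0040, 0.9899)
)

# Recompute the two derived columns from the others.
val$optimism_check  <- val$training - val$test
val$corrected_check <- val$index.orig - val$optimism

# Differences are at most about 1e-4, i.e. rounding of the printed values.
val[, c("index", "optimism", "optimism_check", "corrected", "corrected_check")]
```

So the corrected R^2 of 0.5170 is the apparent R^2 of 0.5612 minus an estimated optimism of 0.0442.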

Frank



Thanks in advance!

Cheers,
Dylan

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University
