Dear Experts,

I'm a new R user and would appreciate your help with the following. I'm
trying to run an exhaustive search over all candidate models in a linear
regression and select the one with the lowest CV error (or, alternatively,
the lowest error on a test set, if I have lots of data). The leaps package
can run this exhaustive search, but all models are evaluated on the training
data (without cross-validation). How can I implement what I'm trying to
achieve? Any guidance will help.


# Follow the example on page 58 of the Elements of Statistical Learning book
library(ElemStatLearn)

train <- subset(prostate, train==TRUE )[,1:9]
test <- subset(prostate, train==FALSE )[,1:9]

#Best subset selection
library(leaps)
prostate.leaps <- regsubsets( lpsa ~ . , data=train, nbest=100, nvmax=8,
                              method="exhaustive", really.big=TRUE)
prostate.leaps.sum <- summary(prostate.leaps)
prostate.models <- prostate.leaps.sum$which
prostate.models
prostate.models.rss <- prostate.leaps.sum$rss
prostate.models.rss
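# The row names of the "which" matrix give each model's size (number of predictors)
prostate.models.size <- as.numeric(rownames(prostate.models))
# Lowest training RSS among the models of each size
prostate.models.best.rss <- tapply(prostate.models.rss, prostate.models.size,
                                   min)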
prostate.models.best.rss
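
One possible way to get a cross-validated score for every candidate model,
sketched here in base R without leaps: enumerate all predictor subsets with
combn(), fit each one with lm() on K-1 folds, and average the squared
prediction error on the held-out fold. The function name cv_best_subset and
its arguments are hypothetical, and the built-in mtcars data stands in for
the prostate data so the sketch runs on its own. This is only practical for
a small number of predictors, since the number of subsets doubles with each
one.

```r
# Sketch: exhaustive subset search scored by K-fold cross-validation.
# cv_best_subset() is a hypothetical helper, not part of leaps or base R.
cv_best_subset <- function(data, response, K = 10, seed = 1) {
  predictors <- setdiff(names(data), response)
  set.seed(seed)
  # Assign each row to one of K folds at random
  fold <- sample(rep(seq_len(K), length.out = nrow(data)))

  # Enumerate every non-empty subset of the predictors
  subsets <- unlist(
    lapply(seq_along(predictors),
           function(m) combn(predictors, m, simplify = FALSE)),
    recursive = FALSE
  )

  # Mean squared prediction error over the held-out folds for one subset
  cv_mse <- function(vars) {
    f <- reformulate(vars, response = response)
    sq_err <- numeric(nrow(data))
    for (k in seq_len(K)) {
      held_out <- fold == k
      fit <- lm(f, data = data[!held_out, , drop = FALSE])
      pred <- predict(fit, newdata = data[held_out, , drop = FALSE])
      sq_err[held_out] <- (data[[response]][held_out] - pred)^2
    }
    mean(sq_err)
  }

  result <- data.frame(
    size   = lengths(subsets),
    model  = vapply(subsets, paste, character(1), collapse = " + "),
    cv.mse = vapply(subsets, cv_mse, numeric(1)),
    stringsAsFactors = FALSE
  )
  # Best (lowest CV error) model first
  result[order(result$cv.mse), ]
}

# Example on a built-in data set (mtcars stands in for the prostate data)
cars <- mtcars[, c("mpg", "wt", "hp", "qsec", "drat")]
cv.results <- cv_best_subset(cars, response = "mpg", K = 5)
head(cv.results, 3)
```

With the data above, the same call would presumably be
cv_best_subset(train, response = "lpsa"). For the test-set alternative, the
loop over folds could be replaced by a single fit on train and a single
prediction on test.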

Thanks a lot!

Lars.


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
