I hope it helps.
Best, Dimitris
----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat/
     http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm
----- Original Message -----
From: "Trevor Wiens" <[EMAIL PROTECTED]>
To: "Prof Brian Ripley" <[EMAIL PROTECTED]>
Cc: <[email protected]>
Sent: Tuesday, March 15, 2005 4:59 PM
Subject: Re: [R] cv.glm {boot}
On Tue, 15 Mar 2005 07:05:49 +0000 (GMT) Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
Cross-validation assumes exchangeability of units. You can easily write
your own code (lots of examples in MASS), but first you would need to
prove the validity of what you are attempting. For example, dropping
chunks in the middle of a time series is not valid unless your prediction
somehow takes the temporal structure into account (and glm does not).
Yes, I'm aware of that, and I do have a number of predictors which vary with time (from year to year, such as precipitation or properly timed vegetation indices from each year), so that isn't my problem. My spatial blocking is also valid (distinct partitions of the study area). I'm also aware of the problems of spatial autocorrelation and have taken some measures to deal with that. I am, however, rather new to R and not a statistician, so I rely heavily on books such as Hosmer and Lemeshow or Manly (Resource Selection by Animals) for procedure. Unfortunately, they are not S-PLUS or R oriented, so I have some difficulty translating those ideas to R.
You mention lots of examples in MASS regarding cross-validation, but I can't find them. Perhaps I'm looking in the wrong spot. I've done help.search('validation'), .... and found nothing that seemed obviously applicable to my problem. I suppose I should pick up a copy of your books, which would probably be very helpful. However, if it isn't too much trouble, I would really appreciate a bit more direct help.
This is roughly what I assumed I would do (in this example, basp = Baird's Sparrow presence or absence):
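# hold out the year 2000; the trailing comma selects rows, not columns
train <- birddata[birddata$recordyear != 2000, ]
test <- birddata[birddata$recordyear == 2000, ]
# fit on the training years only (data=train, not the full birddata)
train.glm <- glm(basp ~ elev + slope + precip + precip_1 ..., data=train, family=binomial)
# predicted presence probabilities for the held-out year
pred <- predict(train.glm, newdata=test, type='response')
actual <- test$basp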
what happens next??
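
A minimal sketch of one possible next step, not a definitive procedure: repeat the hold-out for every year in turn (leave-one-year-out), pool the out-of-sample predictions, and compare them with the observed values. Everything below is an assumption made for illustration: the simulated data frame stands in for birddata, only three of the predictors are used, and the 0.5 cutoff is arbitrary. Whether blocking by year is valid for a given data set is the exchangeability question raised above and is not settled by this code.

```r
# Hypothetical stand-in for birddata: simulated values, not real survey data.
set.seed(1)
n <- 600
birddata <- data.frame(
  recordyear = sample(1998:2002, n, replace = TRUE),
  elev       = rnorm(n),
  slope      = rnorm(n),
  precip     = rnorm(n)
)
birddata$basp <- rbinom(n, 1,
                        plogis(-0.5 + 0.8 * birddata$elev - 0.6 * birddata$precip))

years <- sort(unique(birddata$recordyear))

# Leave-one-year-out: fit on all other years, predict the held-out year.
cv <- do.call(rbind, lapply(years, function(yr) {
  train <- birddata[birddata$recordyear != yr, ]
  test  <- birddata[birddata$recordyear == yr, ]
  fit   <- glm(basp ~ elev + slope + precip, data = train, family = binomial)
  data.frame(year   = yr,
             actual = test$basp,
             pred   = predict(fit, newdata = test, type = "response"))
}))

# Classification table and error rate at an arbitrary 0.5 cutoff.
predicted_class <- as.integer(cv$pred > 0.5)
conf_table <- table(actual = cv$actual, predicted = predicted_class)
error_rate <- mean(predicted_class != cv$actual)

# Area under the ROC curve via the rank (Mann-Whitney) formula.
n1  <- sum(cv$actual == 1)
n0  <- sum(cv$actual == 0)
auc <- (sum(rank(cv$pred)[cv$actual == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

# Misclassification rate for each held-out year separately.
by_year <- tapply(predicted_class != cv$actual, cv$year, mean)

print(conf_table)
print(c(error_rate = error_rate, auc = auc))
print(by_year)
```

The per-year error rates show whether one unusual year dominates the pooled figure; a cutoff other than 0.5 may suit a rare species better.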
Thanks in advance.
T
--
Trevor Wiens [EMAIL PROTECTED]
The significant problems that we face cannot be solved at the same level of thinking we were at when we created them. (Albert Einstein)
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
