Re: [R] Prediction Error Calculation
Any help would be appreciated.

--
View this message in context: http://www.nabble.com/Prediction-Error-Calculation-tp26031236p26113145.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Prediction Error Calculation
Any suggestions?

--
View this message in context: http://www.nabble.com/Prediction-Error-Calculation-tp26031236p26066845.html
[R] Prediction Error Calculation
Hello List,

I am fitting a logistic regression model to some presence/absence data. I have numerous covariates that I am fitting to explain variation, and I am using AIC to rank models. However, I would also like to report how well my best model(s) do at prediction. I have looked over the archives and the web and have come up with something that gives me what I think is the mean prediction error, BUT I am not sure of that. I am fairly unfamiliar with these types of statistics.

Here is my code:

library(boot)  # provides glm.diag()
## Type is my binary response, 0 or 1
metrics.global <- glm(Type ~ MPI + IJI + ED + PRD + class2 + class3 + class5,
                      family = binomial, data = metrics)
muhat <- metrics.global$fitted.values    ## assigns the fitted values the name muhat
global.diag <- glm.diag(metrics.global)  ## creates the diagnostic values
cv.err <- mean((metrics.global$y - muhat)^2 / (1 - global.diag$h)^2)  ## calculates cv.err
cv.err

My main problem is that I am unsure how to interpret what cv.err means for my model. I know that h is a leverage statistic for each observation. I would appreciate some clarification on the interpretation.

Thank you.

--
View this message in context: http://www.nabble.com/Prediction-Error-Calculation-tp26031236p26031236.html
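On the interpretation question: the cv.err expression resembles the leave-one-out shortcut that, as far as I recall, the cv.glm help page in the boot package shows for a linear model. Read that way, cv.err estimates the average squared difference between each observed 0/1 response and the probability predicted for it by a model fitted without that observation. Dividing by (1 - h)^2 inflates the residuals of high-leverage points to mimic leaving them out. For an ordinary linear model the shortcut is exact; for a logistic model it is only an approximation. Smaller is better, and always predicting the observed prevalence would score p(1 - p), which is at most 0.25. The sketch below uses simulated data (x1, x2, and dat are made-up names, and hatvalues() from base R stands in for glm.diag()$h) to compare the shortcut with an explicit leave-one-out loop.

```r
# Sketch on simulated data; x1, x2, and dat are hypothetical stand-ins
# for the poster's covariates and data frame.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-0.5 + 1.2 * x1 - 0.8 * x2))
dat <- data.frame(y, x1, x2)

fit   <- glm(y ~ x1 + x2, family = binomial, data = dat)
muhat <- fitted(fit)     # fitted probabilities
h     <- hatvalues(fit)  # leverages (assumed equivalent to glm.diag(fit)$h)

# Apparent (in-sample) mean squared error of the predicted probabilities
apparent <- mean((dat$y - muhat)^2)

# Leverage-based shortcut, the same expression as cv.err in the post
cv.approx <- mean((dat$y - muhat)^2 / (1 - h)^2)

# Explicit leave-one-out: refit without row i, then predict row i
loo.pred <- vapply(seq_len(n), function(i) {
  fit.i <- glm(y ~ x1 + x2, family = binomial, data = dat[-i, ])
  unname(predict(fit.i, newdata = dat[i, ], type = "response"))
}, numeric(1))
cv.loo <- mean((dat$y - loo.pred)^2)

# Reference point: predict the overall prevalence for every observation
baseline <- mean((dat$y - mean(dat$y))^2)

print(c(apparent = apparent, cv.approx = cv.approx,
        cv.loo = cv.loo, baseline = baseline))
```

If the shortcut and the explicit loop agree closely on your own data, cv.err can be reported as an approximate leave-one-out mean squared prediction error and compared against the prevalence baseline. cv.glm() in boot does the refitting for you if you prefer not to write the loop.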
Re: [R] Time Dependent Cox Model
Does anyone have suggestions? Thanks!

--
View this message in context: http://www.nabble.com/Time-Dependent-Cox-Model-tp25881478p25893488.html
Re: [R] Time Dependent Cox Model
Someone suggested that I go into more detail on what I want to accomplish and include the rest of my code. I want to do exactly what Fox did in this article (http://www.nabble.com/file/p25897307/appendix-cox-regression.pdf appendix-cox-regression.pdf), starting on page 7, except using habitat instead of employment. I want habitat to be a time-dependent covariate, and it varies by day. I read in my data from the csv file. One major difference between the data set Fox used and mine is that I have a DaysatRisk column instead of the week the person went back to jail. I think this is the root of my problem in calculating the proper death.time. The death.time column should be 1s and 0s, with the 1 marking the day the animal died.

Thanks in advance,

library(survival)  # provides Surv() and coxph()
surv <- read.csv("Survival_master2.csv", header = TRUE)
sum(!is.na(surv[, 16:726]))  # number of non-missing daily habitat records
surv2 <- matrix(0, 12329, 19)
colnames(surv2) <- c('start', 'stop', 'death.time', names(surv)[1:15], 'habitat')
row <- 0  # set record counter to 0
for (i in 1:nrow(surv)) {  # loop over individuals
  for (j in 16:726) {  # loop over the daily habitat columns
    if (is.na(surv[i, j])) next  # skip missing data
    else {
      row <- row + 1     # increment row counter
      start <- j - 11    # start time (previous day)
      stop <- start + 1  # stop time (current day)
      death.time <- if (stop == surv[i, 4] && surv[i, 5] == 1) 1 else 0
      # construct record:
      surv2[row, ] <- c(start, stop, death.time, unlist(surv[i, c(1:15, j)]))
    }
  }
}
surv2 <- as.data.frame(surv2)
remove(i, j, row, start, stop, death.time)
surv2[1:15, ]
test <- coxph(Surv(start, stop, death.time) ~ habitat, data = surv2)

JorisMeys wrote:

Well, it might be wise to elaborate a bit more on the variables and on what exactly you want, e.g., death.time to be. I'd interpret it as time of death, but the fact that it is 0/1 means it is a logical (?) binary variable of some sort. Please ask your question in such a way that somebody who doesn't know the dataset and your research can still understand what is inside the dataset and what exactly you're trying to obtain.

I'd also suggest adding the command to read in the data. I don't have the time to work out how to read in the dataset so that it matches what you have in your workspace.

Cheers
Joris

On Wed, Oct 14, 2009 at 5:37 PM, quaildoc just.strut...@gmail.com wrote:

Does anyone have suggestions? Thanks!

--
View this message in context: http://www.nabble.com/Time-Dependent-Cox-Model-tp25881478p25897307.html
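A few things stand out when comparing the loop with Fox's version, though without the data this is only a guess. As I read Fox's appendix, his weekly columns start at column 11, so `j - 11` makes the first interval run from 0 to 1. Here the daily columns start at 16, so the matching offset would be `j - 16`; with `j - 11` every stop value is 5 days late relative to DaysatRisk. It may also be worth confirming that columns 4 and 5 of surv really are DaysatRisk and the death indicator. Check as well that habitat is not NA on the day of death, since a skipped day can never produce a 1. The sketch below rebuilds the same kind of start/stop table from simulated data. The column layout (id, days.at.risk, died, then day1 to day20) is hypothetical and much smaller than the real file.

```r
# Sketch on simulated data; the layout below is a made-up stand-in for
# Survival_master2.csv (id, days.at.risk, died, then one column per day).
set.seed(42)
n.id   <- 30
n.days <- 20
wide <- data.frame(id = seq_len(n.id),
                   days.at.risk = sample(3:n.days, n.id, replace = TRUE),
                   died = rbinom(n.id, 1, 0.6))
hab <- matrix(rbinom(n.id * n.days, 1, 0.5), n.id, n.days)
colnames(hab) <- paste0("day", seq_len(n.days))
# habitat is missing once an animal has left the risk set
for (i in seq_len(n.id)) {
  last <- wide$days.at.risk[i]
  if (last < n.days) hab[i, (last + 1):n.days] <- NA
}
wide <- cbind(wide, hab)

first.day.col <- 4  # position of the day1 column in this layout
day.cols <- first.day.col:ncol(wide)

records <- vector("list", n.id)
for (i in seq_len(n.id)) {
  hab.i <- unname(unlist(wide[i, day.cols]))
  day   <- which(!is.na(hab.i))  # day number = j - first.day.col + 1
  records[[i]] <- data.frame(
    id         = wide$id[i],
    start      = day - 1,        # equals j - first.day.col
    stop       = day,
    death.time = as.integer(day == wide$days.at.risk[i] & wide$died[i] == 1),
    habitat    = hab.i[day]
  )
}
long <- do.call(rbind, records)

# Sanity check: one event row per animal that died
print(c(events = sum(long$death.time), deaths = sum(wide$died)))

# Fit the counting-process Cox model if the survival package is installed
if (requireNamespace("survival", quietly = TRUE)) {
  library(survival)
  fit <- coxph(Surv(start, stop, death.time) ~ habitat, data = long)
  print(summary(fit))
}
```

Running the same events-versus-deaths check on surv2 should show quickly whether the 1s are being lost to the offset, to the column indices, or to missing habitat on the final day.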
[R] Time Dependent Cox Model
I am having trouble formatting some survival data for use in a time-dependent Cox model. My time-dependent variable is habitat, and I have it recorded for every day (with some NAs). I think the code is working properly except for calculating death.time. This column should contain 1s and 0s, but as I have it, it only produces 0s. Any help will be greatly appreciated.

http://www.nabble.com/file/p25881478/Survival_master2.csv Survival_master2.csv

Here is my code:

sum(!is.na(surv[, 16:726]))  # number of non-missing daily habitat records
surv2 <- matrix(0, 12329, 19)
colnames(surv2) <- c('start', 'stop', 'death.time', names(surv)[1:15], 'habitat')
row <- 0  # set record counter to 0
for (i in 1:nrow(surv)) {  # loop over individuals
  for (j in 16:726) {  # loop over the daily habitat columns
    if (is.na(surv[i, j])) next  # skip missing data
    else {
      row <- row + 1     # increment row counter
      start <- j - 11    # start time (previous day)
      stop <- start + 1  # stop time (day)
      death.time <- if (stop == surv[i, 4] && surv[i, 5] == 1) 1 else 0
      # construct record:
      surv2[row, ] <- c(start, stop, death.time, unlist(surv[i, c(1:15, j)]))
    }
  }
}
surv2 <- as.data.frame(surv2)

--
View this message in context: http://www.nabble.com/Time-Dependent-Cox-Model-tp25881478p25881478.html