Re: [R] Prediction Error Calculation

2009-10-29 Thread quaildoc

Any help would be appreciated.

-- 
View this message in context: 
http://www.nabble.com/Prediction-Error-Calculation-tp26031236p26113145.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Prediction Error Calculation

2009-10-26 Thread quaildoc

Any suggestions?

-- 
View this message in context: 
http://www.nabble.com/Prediction-Error-Calculation-tp26031236p26066845.html


[R] Prediction Error Calculation

2009-10-23 Thread quaildoc

Hello List,

I am fitting a logistic regression model for some presence/absence type
data.  I have numerous covariates I am fitting to explain variation, and I
am using AIC to rank models.  However, I would like to report how well my
best model(s) do at prediction.  I have looked over the archives and the
web and have come up with something that gives me what I think is the mean
prediction error, BUT I am not sure of that. I am sort of unfamiliar with
these types of statistics.  Here is my code:


library(boot)  # provides glm.diag()

metrics.global <- glm(Type ~ MPI + IJI + ED + PRD + class2 + class3 + class5,
                      family = binomial, data = metrics)  # Type is my binary response, 0 or 1

muhat <- metrics.global$fitted.values    # assigns the fitted values the name muhat
global.diag <- glm.diag(metrics.global)  # creates the diagnostic values, including the leverages h
cv.err <- mean((metrics.global$y - muhat)^2 / (1 - global.diag$h)^2)  # calculates cv.err
cv.err


My main problem is that I am unsure how to interpret what cv.err means for my
model.  I know that h is the leverage statistic for each observation.  I would
appreciate help with the interpretation.

Thank you.
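
For readers following the thread, here is a minimal sketch of what the
cv.err expression above appears to estimate. It assumes simulated data in
place of the real metrics data frame; the names dat, x1, x2, cv.shortcut and
cv.loo are made up for illustration. The expression has the same form as the
leave-one-out shortcut given in the cv.glm() help page in the boot package,
which is exact for ordinary linear models. For a logistic model it is only an
approximation to the leave-one-out mean squared error of the predicted
probabilities, so the sketch compares it with an explicit leave-one-out run.

```r
library(boot)  # glm.diag() and cv.glm()

set.seed(1)
n <- 200
# Simulated stand-in for the real data; x1 and x2 are hypothetical covariates.
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$Type <- rbinom(n, 1, plogis(-0.5 + 1.2 * dat$x1 - 0.8 * dat$x2))

fit   <- glm(Type ~ x1 + x2, family = binomial, data = dat)
muhat <- fitted(fit)          # fitted probabilities
h     <- glm.diag(fit)$h      # leverage of each observation

# In-sample mean squared error of the fitted probabilities.
brier.insample <- mean((fit$y - muhat)^2)

# Leverage shortcut from the post: each residual is inflated by 1 / (1 - h).
cv.shortcut <- mean((fit$y - muhat)^2 / (1 - h)^2)

# Explicit leave-one-out cross-validation (n refits). The default cost in
# cv.glm() is the average squared error between y and the prediction.
cv.loo <- cv.glm(dat, fit)$delta[1]

round(c(insample = brier.insample, shortcut = cv.shortcut, loo = cv.loo), 4)
```

All three numbers are on the squared-probability scale (0 is perfect, and
always predicting 0.5 gives 0.25), so smaller values mean better prediction.
The shortcut can never be smaller than the in-sample value, because every
term is divided by a quantity no larger than 1.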




-- 
View this message in context: 
http://www.nabble.com/Prediction-Error-Calculation-tp26031236p26031236.html


Re: [R] Time Dependent Cox Model

2009-10-14 Thread quaildoc

Does anyone have suggestions? Thanks!

-- 
View this message in context: 
http://www.nabble.com/Time-Dependent-Cox-Model-tp25881478p25893488.html


Re: [R] Time Dependent Cox Model

2009-10-14 Thread quaildoc

Someone suggested that I go into more detail about what I want to accomplish
and post the rest of my code.  I want to do exactly what Fox did in this
article (http://www.nabble.com/file/p25897307/appendix-cox-regression.pdf),
starting on page 7, except using habitat instead of employment.  I want
habitat to be a time-dependent covariate, and it varies by day.

I read in my data from the .csv file.  One major difference between the data
set Fox used and mine is that I have a DaysatRisk column instead of the week
the person went back to jail.  I think this is the root of my problem in
calculating the proper death.time.  The death.time column should be 1s and 0s
that correspond to the day the animal died.

Thanks in advance,



library(survival)  # provides Surv() and coxph()

surv <- read.csv("Survival_master2.csv", header = TRUE)

sum(!is.na(surv[, 16:726]))  # number of non-missing habitat records

surv2 <- matrix(0, 12329, 19)
colnames(surv2) <- c('start', 'stop', 'death.time',
                     names(surv)[1:15], 'habitat')
row <- 0  # set record counter to 0
for (i in 1:nrow(surv)) {  # loop over individuals
  for (j in 16:726) {  # loop over the daily habitat columns
    if (is.na(surv[i, j])) next  # skip missing data
    else {
      row <- row + 1  # increment row counter
      start <- j - 11  # start time (previous day)
      stop <- start + 1  # stop time (current day)
      death.time <- if (stop == surv[i, 4] && surv[i, 5] == 1) 1 else 0
      # construct record:
      surv2[row, ] <- c(start, stop, death.time, unlist(surv[i, c(1:15, j)]))
    }
  }
}
surv2 <- as.data.frame(surv2)
remove(i, j, row, start, stop, death.time)

surv2[1:15, ]

test <- coxph(Surv(start, stop, death.time) ~ habitat, data = surv2)
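
As an illustration of the wide-to-long expansion described above, here is a
small sketch on made-up data. The data frame surv.toy, its column names and
the variable first.col are all hypothetical, and the real layout of
Survival_master2.csv is not known here. In this sketch the day number is
derived from the position of the first habitat column, so one thing that may
be worth checking when death.time comes out as all 0s is whether the offset
subtracted from j matches where the daily columns start, and whether the
columns compared against really hold the days at risk and the death
indicator. That is a guess, not a diagnosis.

```r
# Hypothetical wide data: one row per animal, one habitat column per day.
surv.toy <- data.frame(
  id   = 1:3,
  days = c(3, 5, 2),   # days at risk
  died = c(1, 0, 1),   # 1 = died on the last day at risk, 0 = censored
  hab1 = c("A", "A", "B"),
  hab2 = c("A", NA,  "B"),
  hab3 = c("B", "A", NA),
  hab4 = c(NA,  "B", NA),
  hab5 = c(NA,  "B", NA),
  stringsAsFactors = FALSE
)
first.col <- 4                 # position of the first habitat column
last.col  <- ncol(surv.toy)

rows <- list()
for (i in seq_len(nrow(surv.toy))) {        # loop over animals
  for (j in first.col:last.col) {           # loop over daily habitat columns
    if (is.na(surv.toy[i, j])) next         # skip days with no habitat record
    stop.day  <- j - first.col + 1          # first habitat column is day 1
    start.day <- stop.day - 1
    # 1 only on the interval that ends on the animal's day of death
    death <- as.integer(stop.day == surv.toy$days[i] && surv.toy$died[i] == 1)
    rows[[length(rows) + 1]] <- data.frame(
      id = surv.toy$id[i], start = start.day, stop = stop.day,
      death.time = death, habitat = surv.toy[i, j]
    )
  }
}
surv.long <- do.call(rbind, rows)
surv.long
table(surv.long$death.time)
```

The names start.day and stop.day are used here only to avoid masking R's
built-in stop() function. Building the records as a list and binding them at
the end also avoids having to know the number of rows in advance.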


JorisMeys wrote:
 
 Well,
 
 it might be wise to elaborate a bit more about the variables and what
 exactly you want e.g. death-time to be. I'd interpret it as time of
 death, but the fact that it is 0/1 means it is a logical (?) binary
 variable of some sort.
 
 Please ask your question in such a way that somebody who doesn't know
 the dataset and your research, can still understand what is inside the
 dataset and what exactly you're trying to obtain.
 
 I'd also suggest to add the command to read in the data. I don't have
 the time to spend looking around how exactly I can read in the dataset
 in such a way it fits what you have in your workspace.
 
 Cheers
 Joris
 

-- 
View this message in context: 
http://www.nabble.com/Time-Dependent-Cox-Model-tp25881478p25897307.html


[R] Time Dependent Cox Model

2009-10-13 Thread quaildoc

I am having trouble formatting some survival data to use in a time dependent
cox model. My time dep. variable is habitat and I have it recorded for every
day (with some NAs).  I think it is working properly except for calculating
the death.time. This column should be 1s or 0s and as I have it only
produces 0s.  Any help will be greatly appreciated.


http://www.nabble.com/file/p25881478/Survival_master2.csv



Here is my code:

sum(!is.na(surv[, 16:726]))  # number of non-missing habitat records

surv2 <- matrix(0, 12329, 19)
colnames(surv2) <- c('start', 'stop', 'death.time',
                     names(surv)[1:15], 'habitat')
row <- 0  # set record counter to 0
for (i in 1:nrow(surv)) {  # loop over individuals
  for (j in 16:726) {  # loop over the daily habitat columns
    if (is.na(surv[i, j])) next  # skip missing data
    else {
      row <- row + 1  # increment row counter
      start <- j - 11  # start time (previous day)
      stop <- start + 1  # stop time (day)
      death.time <- if (stop == surv[i, 4] && surv[i, 5] == 1) 1 else 0
      # construct record:
      surv2[row, ] <- c(start, stop, death.time, unlist(surv[i, c(1:15, j)]))
    }
  }
}
surv2 <- as.data.frame(surv2)

-- 
View this message in context: 
http://www.nabble.com/Time-Dependent-Cox-Model-tp25881478p25881478.html