[R] Extracting parameters for Gamma Distribution
I'm doing a Cox regression with frailty:

    model <- coxph(Surv(Start, Stop, Terminated) ~ X + frailty(id), table)

I understand that model$frail returns the group-level frailty terms. Does this mean it is the average of the frailty values for the respective groups? Also, if I'm fitting a gamma frailty, how do I extract the rate and scale parameters for the different gamma distributions fit to each group?

Thanks.
Yongchuan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
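For readers with the same question, a minimal sketch follows. It uses the `kidney` data bundled with the survival package as a stand-in, since the poster's `table` and covariate `X` are not shown. It also rests on the reading that `frailty()` fits one gamma distribution with mean 1 and variance theta shared by all groups, rather than a separate gamma per group. Under that reading, `model$frail` holds one estimated log-frailty per group, and the shape and rate of the gamma are both 1/theta.

```r
library(survival)

# Stand-in data: 'kidney' ships with the survival package (an assumption;
# the poster's 'table' and covariate X are not available here).
fit <- coxph(Surv(time, status) ~ age + sex + frailty(id, distribution = "gamma"),
             data = kidney)

# One estimated log-frailty per group; exponentiate to get hazard multipliers.
log_frailty <- fit$frail
frailty_hat <- exp(log_frailty)

# Estimated frailty variance theta, as stored in the penalized-fit history.
theta <- fit$history[[1]]$theta

# A mean-1 gamma with variance theta has shape = rate = 1/theta (scale = theta).
shape <- 1 / theta
rate  <- 1 / theta

print(c(theta = theta, shape = shape, rate = rate, scale = 1 / rate))
print(head(frailty_hat))
```

If this reading is right, there is a single (shape, rate) pair for the whole model, and the per-group quantities are the estimated frailties themselves.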
Re: [R] Error when naming rows of dataset
I'm pretty new with R, so the only error message I see is the one below that I pasted. I'm attaching the first few rows of the file for reference. The layout looks screwy when I attach it here, but 'Start' to 'closingCoupon' is the first row in the .txt file. Thx!

         Start  Stop  PrepayDate  modBalance  closingCoupon
    1.1      6     7           0    811.2769           8.35
    1.2      7     8           0    811.2769           8.35
    1.3      8     9           1    811.2769           8.35
    2.1      4     5           0   2226.0825           8.7
    2.2      5     6           0   2226.0825           8.7
    2.3      6     7           0   2226.0825           8.7
    2.4      7     8           0   2226.0825           8.7
    2.5      8     9           0   2226.0825           8.7
    2.6      9    10           0   2226.0825           8.7
    2.7     10    11           0   2226.0825           8.7
    2.8     11    12           0   2226.0825           8.7
    2.9     12    13           0   2226.0825           8.7
    2.1     13    14           0   2226.0825           8.7

From: Michael Dewey [EMAIL PROTECTED]
Date: Wed 25/10/2006 6:38 PM SGT
To: yongchuan [EMAIL PROTECTED], r-help@stat.math.ethz.ch
Subject: Re: [R] Error when naming rows of dataset

At 17:30 24/10/2006, yongchuan wrote:

I get the following error when I try reading in a table. How are 1.1, 1.2, 1.3 duplicate row names? Thx.

R gives you brief details of where it was when it fell over. Have you checked in 'latestWithNumber.txt' to see whether R is right?

    table <- read.table('latestWithNumber.txt', header=T)
    Error in row.names<-.data.frame(`*tmp*`, value = c(1.1, 1.2, 1.3, : duplicate 'row.names' are not allowed

Yongchuan

Michael Dewey
http://www.aghmed.fsnet.co.uk
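The excerpt above hints at what R is complaining about. The header has five names but each data row has six fields, so `read.table()` takes the first field as row names, and the label 2.1 appears twice among the `2.x` rows. A plausible but unconfirmed cause is that a record meant as 2.10 was written out as the number 2.1. The sketch below reproduces the error on a three-row miniature of the file and shows one possible workaround; the column name `loanRow` is made up for illustration.

```r
# Miniature of the posted file: 5 header names, 6 fields per row.
txt <- "Start Stop PrepayDate modBalance closingCoupon
2.1 4 5 0 2226.0825 8.7
2.9 12 13 0 2226.0825 8.7
2.1 13 14 0 2226.0825 8.7"

# With one more field than header names, the first field becomes row names,
# and the repeated 2.1 triggers the error from the post.
msg <- tryCatch(
  { read.table(text = txt, header = TRUE); "no error" },
  error = function(e) conditionMessage(e)
)
print(msg)

# Workaround: skip the header and name all six columns explicitly, keeping
# the label as an ordinary character column ('loanRow' is a hypothetical name).
dat <- read.table(
  text = txt, header = FALSE, skip = 1,
  col.names = c("loanRow", "Start", "Stop", "PrepayDate",
                "modBalance", "closingCoupon"),
  colClasses = c("character", rep("numeric", 5))
)
print(dat)
print(dat$loanRow[duplicated(dat$loanRow)])  # labels that occur more than once
```

Reading the label as character keeps it from being reformatted again, but it cannot recover a distinction that was already lost when the file was written.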
[R] Incorrect 'n' returned by survfit()
I've a data set with 6 rows of data representing 6000+ distinct loans. I did a coxph() regression on it (see call below), but the result of a subsequent survfit() call on the coxph object is almost certainly wrong. It gives n=6 when it should be more like 6000+ (I think).

    survfit(resultag)
    Call: survfit.coxph(object = resultag)

          n  events  median  0.95LCL  0.95UCL
          6     489     Inf        2      Inf

When I reduced the dataset to just 1000 rows, the survfit() call on the coxph object looks more correct.

    survfit(resulting)
    Call: survfit.coxph(object = resulting)

          n  events  median  0.95LCL  0.95UCL
        115      15     Inf      Inf      Inf

Is there a limit to the size of the data set that I read in? Or am I just doing something silly above? Thanks much.

Yongchuan

(This is the coxph regression:

    resultag <- coxph(Surv(Start,Stop,PrepayDate) ~ modBalance + closingCoupon + lienPosition + originalFICO, table)
)
[R] Error when naming rows of dataset
I get the following error when I try reading in a table. How are 1.1, 1.2, 1.3 duplicate row names? Thx.

    table <- read.table('latestWithNumber.txt', header=T)
    Error in row.names<-.data.frame(`*tmp*`, value = c(1.1, 1.2, 1.3, : duplicate 'row.names' are not allowed

Yongchuan
[R] Construction of Dataset for time varying COXPH analysis
Question: When the survfit() function is used on a coxph object, the 'n' returned is vastly smaller (n=6) than the number of distinct loans in the dataset used.

I am trying to estimate a Cox proportional hazards model for a set of loans (over 6000) using time-varying covariates. For these 6000+ loans, I have some 62,000 different vectors representing the loans at different periods of time. I did the following:

    resultsOpt <- coxph(Surv(Start,Stop,PrepayDate) ~ closingCoupon + loanPurposeId, data=latest)

which returned:

    Call: coxph(formula = Surv(Start, Stop, PrepayDate) ~ closingCoupon + loanPurposeId, data = latest)

                    coef  exp(coef)  se(coef)     z        p
    closingCoupon  0.101       1.11    0.0271  3.73  1.9e-04
    loanPurposeId  0.434       1.54    0.0624  6.96  3.3e-12

    Likelihood ratio test=50.3 on 2 df, p=1.18e-11  n= 62297

which seems fair. However, when I do:

    survfit(resultsOpt)
    Call: survfit.coxph(object = resultsOpt)

          n  events  median  0.95LCL  0.95UCL
          6     489     Inf      Inf      Inf

the n = 6, when the number of distinct loans in the dataset is more like 6554. My dataset looks like the following when I call it from within R:

    latest[1:5, 1:5]
      Start  Stop  PrepayDate  modBalance  closingCoupon
    1     6     7           0    811.2769           8.35
    2     7     8           0    811.2769           8.35
    3     8     9           1    811.2769           8.35
    4     4     5           0   2226.0825           8.70
    5     5     6           0   2226.0825           8.70

where the first 3 rows represent one loan, and the next 2 rows a new one. Am I putting the data in an incorrect format, and if so, how should I correct it? Thanks much.

Pan
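As a sanity check on the counts discussed above, the sketch below tallies rows, distinct loans, events, and the size of the risk set at the earliest stop time for data in this (Start, Stop] layout. It uses the five rows shown in the post plus a made-up `loanId` column, since the posted data has no visible loan identifier. One possible reading, not confirmed in this thread, is that the `n` printed by `survfit()` for delayed-entry data is closer to an initial risk-set size than to the number of rows or loans, which would make a small value less alarming than it looks.

```r
# The five rows from the post; loanId is a hypothetical identifier added here.
latest <- data.frame(
  loanId        = c(1, 1, 1, 2, 2),
  Start         = c(6, 7, 8, 4, 5),
  Stop          = c(7, 8, 9, 5, 6),
  PrepayDate    = c(0, 0, 1, 0, 0),
  modBalance    = c(811.2769, 811.2769, 811.2769, 2226.0825, 2226.0825),
  closingCoupon = c(8.35, 8.35, 8.35, 8.70, 8.70)
)

n_rows   <- nrow(latest)                    # one row per loan-period
n_loans  <- length(unique(latest$loanId))   # distinct loans
n_events <- sum(latest$PrepayDate)          # prepayments observed

# Rows at risk at time t are those with Start < t <= Stop.
at_risk <- function(t) sum(latest$Start < t & latest$Stop >= t)
first_time <- min(latest$Stop)
n_first <- at_risk(first_time)

# Format check: within a loan (rows assumed sorted by Start), each interval
# should begin where the previous one ended.
contiguous <- all(tapply(seq_len(n_rows), latest$loanId, function(i) {
  all(head(latest$Stop[i], -1) == tail(latest$Start[i], -1))
}))

print(c(rows = n_rows, loans = n_loans, events = n_events,
        at_risk_first = n_first))
print(contiguous)
```

On the full data, comparing these tallies with the printed `n` and `events` would show which quantity `survfit()` is reporting in the installed version.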