[R] Extracting parameters for Gamma Distribution

2006-11-07 Thread yongchuan
I'm doing a cox regression with frailty:

model <- coxph(Surv(Start, Stop, Terminated) ~ X + frailty(id), table)

I understand that model$frail returns the group level frailty 
terms. Does this mean this is the average of the frailty 
values for the respective groups? Also, if I'm fitting it to
a gamma frailty, how do I extract the rate and scale 
parameters for the different gamma distributions fit to each
group? Thanks.

Yongchuan
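
For later readers, here is a minimal sketch of where these quantities sit in a fitted object. It assumes the survival package and uses its bundled kidney data as a stand-in, since the poster's table is not available. As far as the package documentation describes it, frailty(id) with the default gamma option fits a single gamma distribution (mean 1, variance theta) shared by all groups, not a separate gamma per group, and model$frail holds one estimated random effect per group on the log scale. Reading theta from fit$history[[1]]$theta depends on how the fit object is laid out, so treat that line as an assumption to check against your installed version.

```r
library(survival)

# Stand-in data: 'kidney' ships with survival (id = patient, two rows each).
fit <- coxph(Surv(time, status) ~ age + sex + frailty(id), data = kidney)

# One estimated random effect per group, on the log-hazard scale.
log_frailty <- fit$frail
frailty_hat <- exp(log_frailty)   # multiplicative frailty per group

# Estimated frailty variance (assumed location inside the fit object).
theta <- fit$history[[1]]$theta

# A gamma with mean 1 and variance theta has shape = 1/theta and
# rate = 1/theta (equivalently scale = theta).
gamma_shape <- 1 / theta
gamma_rate  <- 1 / theta
gamma_scale <- theta

print(c(theta = theta, shape = gamma_shape, rate = gamma_rate))
```

Under these assumptions there is only one shape/rate pair to report; the per-group values in frailty_hat are estimates of individual draws from that one distribution.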

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error when naming rows of dataset

2006-10-25 Thread yongchuan
I'm pretty new with R, so the only error message I see is
the below that I pasted. I'm attaching the first few rows
of the file for reference. The layout looks screwy when I
attach it here but 'Start' to 'closingCoupon' is the first
row in the .txt file. Thx!

     Start  Stop  PrepayDate  modBalance  closingCoupon
1.1      6     7           0    811.2769           8.35
1.2      7     8           0    811.2769           8.35
1.3      8     9           1    811.2769           8.35
2.1      4     5           0   2226.0825            8.7
2.2      5     6           0   2226.0825            8.7
2.3      6     7           0   2226.0825            8.7
2.4      7     8           0   2226.0825            8.7
2.5      8     9           0   2226.0825            8.7
2.6      9    10           0   2226.0825            8.7
2.7     10    11           0   2226.0825            8.7
2.8     11    12           0   2226.0825            8.7
2.9     12    13           0   2226.0825            8.7
2.1     13    14           0   2226.0825            8.7
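
A small sketch of how this error can arise, using made-up inline text rather than the actual latestWithNumber.txt. When the header line has one fewer field than the data lines, read.table() treats the first column as row names, and row names must be unique. In the sample above, the last row carries the label 2.1, the same as an earlier row. Whether that is what the real file contains (for example a label 2.10 that was saved as the number 2.1) is an assumption. Passing row.names = NULL is one way to keep the labels as an ordinary column instead.

```r
# Made-up sample: 3 header fields, 4 fields per data line, so the first
# column is taken as row names. The label 2.1 appears twice.
txt <- "Start Stop PrepayDate
1.1 6 7 0
1.2 7 8 0
2.1 4 5 0
2.1 13 14 0"

# Reading as-is fails because row names must be unique.
msg <- tryCatch(
  { read.table(text = txt, header = TRUE); "no error" },
  error = function(e) conditionMessage(e)
)
print(msg)

# row.names = NULL numbers the rows automatically and keeps the labels
# as a regular first column.
loans <- read.table(text = txt, header = TRUE, row.names = NULL)
print(loans)

# List the labels that occur more than once.
dup_labels <- unique(loans[[1]][duplicated(loans[[1]])])
print(dup_labels)
```

Running the last two steps on the real file should show which labels R considers duplicated.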

 
 From: Michael Dewey [EMAIL PROTECTED]
 Date: Wed 25/10/2006 6:38 PM SGT
 To: yongchuan [EMAIL PROTECTED], r-help@stat.math.ethz.ch
 Subject: Re: [R] Error when naming rows of dataset
 
 At 17:30 24/10/2006, yongchuan wrote:
 I get the following error when I try reading in a table.
 How are 1.1, 1.2, 1.3 duplicate row names? Thx.
 
 R gives you brief details of where it was when it fell over.
 Have you checked in 'latestWithNumber.txt' to see whether R is right?
 
 
   table <- read.table('latestWithNumber.txt', header=T)
 Error in row.names<-.data.frame(`*tmp*`, value = c(1.1, 1.2, 1.3,  :
  duplicate 'row.names' are not allowed
 
 Yongchuan
 
 Michael Dewey
 http://www.aghmed.fsnet.co.uk
 




[R] Incorrect 'n' returned by survfit()

2006-10-25 Thread yongchuan
I've a data set with 6 rows of data representing 6000+ distinct loans. I
did a coxph() regression on it (see call below), but the result of a subsequent
survfit() call on the coxph object is almost certainly wrong. It gives n=6 when
it should be more like 6000+ (I think).

> survfit(resultag)
Call: survfit.coxph(object = resultag)

      n  events  median 0.95LCL 0.95UCL 
      6     489     Inf       2     Inf 

When I reduced the dataset to just 1000 rows, the survfit()
call on the coxph object looks more correct. 

> survfit(resulting)
Call: survfit.coxph(object = resulting)

      n  events  median 0.95LCL 0.95UCL 
    115      15     Inf     Inf     Inf 

Is there a limit to the size of the data set that I read in?
Or am I just doing something silly above?

Thanks much.
Yongchuan

(this is the coxph regression:
resultag <- coxph(Surv(Start, Stop, PrepayDate) ~ modBalance +
closingCoupon + lienPosition + originalFICO, table))



[R] Error when naming rows of dataset

2006-10-24 Thread yongchuan
I get the following error when I try reading in a table.
How are 1.1, 1.2, 1.3 duplicate row names? Thx.

> table <- read.table('latestWithNumber.txt', header=T)
Error in row.names<-.data.frame(`*tmp*`, value = c(1.1, 1.2, 1.3,  : 
duplicate 'row.names' are not allowed

Yongchuan



[R] Construction of Dataset for time varying COXPH analysis

2006-10-23 Thread yongchuan
Question: When the survfit() function is used on a coxph object, the 'n' returned 
is vastly smaller (n=6) than the number of distinct loans in the dataset used. 

I am trying to estimate a Cox proportional hazards model for a set of loans 
(over 6000) using time-varying covariates. For these 6000+ loans, I have 
some 62,000 different vectors representing the loans at different periods of 
time. I did the following:

resultsOpt <- coxph(Surv(Start, Stop, PrepayDate) ~ closingCoupon + loanPurposeId, 
data=latest)

which returned:

Call:
coxph(formula = Surv(Start, Stop, PrepayDate) ~ closingCoupon + 
loanPurposeId, data = latest)


               coef exp(coef) se(coef)    z       p
closingCoupon 0.101      1.11   0.0271 3.73 1.9e-04
loanPurposeId 0.434      1.54   0.0624 6.96 3.3e-12

Likelihood ratio test=50.3  on 2 df, p=1.18e-11  n= 62297 


which seems fair.


However when I do:

> survfit(resultsOpt)
Call: survfit.coxph(object = resultsOpt)

      n  events  median 0.95LCL 0.95UCL 
      6     489     Inf     Inf     Inf 

Here n = 6, whereas the number of distinct loans in the dataset is more like 6554.

My dataset looks like the following when I call it from within R:

> latest[1:5, 1:5]
  Start Stop PrepayDate modBalance closingCoupon
1     6    7          0   811.2769          8.35
2     7    8          0   811.2769          8.35
3     8    9          1   811.2769          8.35
4     4    5          0  2226.0825          8.70
5     5    6          0  2226.0825          8.70


where the first 3 rows represent one loan, and the next 2 rows a new one. Am I 
putting the data in an incorrect format, and if so how should I correct it? 
Thanks much.

Pan
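
To make the layout question concrete, here is a minimal sketch of counting-process (Start, Stop] data with one row per loan per period, assuming the survival package. The loanId column, loans 3 to 5, and the truncated history of loan 2 are hypothetical, added only so the toy model can be fitted. The sketch shows that the n printed by coxph() counts rows (62297 in the post), which is a different quantity from the number of distinct loans. It does not attempt to explain the n reported by survfit().

```r
library(survival)

# One row per loan per period; PrepayDate = 1 marks the period in which the
# loan prepays. loanId and loans 3-5 are hypothetical additions.
latest <- data.frame(
  loanId        = c(1, 1, 1,  2, 2, 2, 2, 2, 2,  3, 3,  4, 4, 4,  5, 5, 5),
  Start         = c(6, 7, 8,  4, 5, 6, 7, 8, 9,  5, 6,  7, 8, 9,  5, 6, 7),
  Stop          = c(7, 8, 9,  5, 6, 7, 8, 9, 10, 6, 7,  8, 9, 10, 6, 7, 8),
  PrepayDate    = c(0, 0, 1,  0, 0, 0, 0, 0, 0,  0, 1,  0, 0, 1,  0, 0, 0),
  closingCoupon = rep(c(8.35, 8.70, 7.90, 9.10, 8.00), times = c(3, 6, 2, 3, 3))
)

# Same model form as in the post, with a single covariate.
fit <- coxph(Surv(Start, Stop, PrepayDate) ~ closingCoupon, data = latest)

n_rows  <- nrow(latest)                    # loan-period rows
n_loans <- length(unique(latest$loanId))   # distinct loans

# coxph's n is the row count; compare it with the number of loans and events.
print(c(coxph_n = fit$n, rows = n_rows, loans = n_loans, events = fit$nevent))
```

Keeping an explicit loan identifier in the data, as in this sketch, makes it straightforward to check counts per loan independently of what any summary prints.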
