Re: [R] survival survfit with newdata

2012-05-17 Thread Damjan Krstajic

Thanks, David, for the prompt reply. I agree with you. However, I still cannot 
get the survfit function to work with newdata. In my previous example I changed 
the column names of the testX matrix and it still fails. 

colnames(testX) <- names(coxph.model$coefficients)
sfit <- survfit(coxph.model, newdata=data.frame(testX))
Error in model.frame.default(formula = Surv(trainTime, trainStatus) ~  :
  variable lengths differ (found for 'trainX')

What would be the solution in my simple example to get the survival curves for 
testX? Thanks in advance. DK

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] survival survfit with newdata

2012-05-17 Thread David Winsemius


On May 17, 2012, at 2:20 AM, Damjan Krstajic wrote:

> Thanks, David, for the prompt reply. I agree with you. However, I still
> cannot get the survfit function to work with newdata. In my previous
> example I changed the column names of the testX matrix and it still
> fails.
>
> colnames(testX) <- names(coxph.model$coefficients)
> sfit <- survfit(coxph.model, newdata=data.frame(testX))
> Error in model.frame.default(formula = Surv(trainTime, trainStatus) ~  :
>   variable lengths differ (found for 'trainX')

I don't get that error when I run this. I do get better results when I pass
a data argument to the coxph call. You should then get predicted survival
curves for the 10 new cases, estimated at the same time points that were
available in the original input data.

coxph.model <- coxph(Surv(trainTime, trainStatus) ~ .,
                     data=data.frame(trainX))
colnames(testX) <- names(coxph.model$coefficients)
sfit <- survfit(coxph.model, newdata=data.frame(testX))
plot(sfit)  # 10 curves

I do not see matrix input to coxph described as a supported form of data
input, so perhaps you should follow the help page more closely?
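
A self-contained variant of the approach above, offered as a sketch rather than as the code that was actually run in this thread. It assumes the survival package is installed, adds a set.seed() call for repeatability, and keeps the outcome and the covariates in one data frame. The names full, train, test and fit are hypothetical and do not appear in the original posts.

```r
library(survival)

set.seed(1)  # assumption: fixed seed, only so the sketch is repeatable

# Simulated data with the same shape as in the thread
x      <- matrix(rnorm(100 * 20), 100, 20)
time   <- runif(100, min = 0, max = 7)
status <- sample(c(0, 1), 100, replace = TRUE)

# One data frame holding outcome and covariates; the unnamed matrix
# columns become X1 ... X20
full  <- data.frame(time = time, status = status, x)
train <- full[11:100, ]   # 90 training patients
test  <- full[1:10, ]     # 10 new patients

# "." expands to the covariate columns of 'train'
fit  <- coxph(Surv(time, status) ~ ., data = train)

# newdata has the same column names as the fitting data
sfit <- survfit(fit, newdata = test)

# One column of survival estimates per row of 'test'
ncol(sfit$surv)
```

Because the training and test rows come from the same data frame, their column names match by construction and no renaming step is needed.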


--
David.



David Winsemius, MD
West Hartford, CT



[R] survival survfit with newdata

2012-05-16 Thread Damjan Krstajic

Dear all,

I am confused by the behaviour of survfit with the newdata option.

I am using the latest version, R 2.15.0. In the simple example below I am 
building a coxph model on 90 patients and trying to predict for 10 new 
patients. Unfortunately, the survival curves at the end are for the 90 
training patients. Could somebody from the survival package please confirm 
whether this behaviour is expected? I cannot find a way of using 'newdata' 
with really new data. Thanks in advance. DK

library(survival)

x <- matrix(rnorm(100*20), 100, 20)
time <- runif(100, min=0, max=7)
status <- sample(c(0,1), 100, replace = TRUE)

# rows 11-100 are the 90 training patients, rows 1-10 the 10 test patients
trainX <- x[11:100,]
trainTime <- time[11:100]
trainStatus <- status[11:100]
testX <- x[1:10,]

coxph.model <- coxph(Surv(trainTime, trainStatus) ~ trainX)
sfit <- survfit(coxph.model, newdata=data.frame(testX))
dim(sfit$surv)
[1] 90 90


  


Re: [R] survival survfit with newdata

2012-05-16 Thread David Winsemius


On May 16, 2012, at 5:08 PM, Damjan Krstajic wrote:

> Dear all,
>
> I am confused by the behaviour of survfit with the newdata option.

Yes. It has the same behavior as any other newdata/predict from
regression. You need to supply a data frame with the same names as in
the original formula. It doesn't look as though that strategy is being
followed. The name of the column needs to be 'trainX', since that was
its name on the RHS of the formula, and you may want to
specify times. If you fail to follow those rules, the function falls
back on offering estimates from the original data.




> I am using the latest version, R 2.15.0. In the simple example below
> I am building a coxph model on 90 patients and trying to predict for
> 10 new patients. Unfortunately, the survival curves at the end are
> for the 90 training patients.

As is proper with a malformed newdata argument.
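
One way to check the naming rule described above, as a base-R sketch that does not fit any model. It only compares the variable name on the right-hand side of the formula with the column names of each candidate newdata frame. The coefficient labels trainX1 ... trainX20 and the names f, rhs_vars, nd1, nd2 and nd3 are assumptions for illustration. The sketch does not establish that survfit() accepts the matrix-column frame nd3.

```r
# Same shapes as in the thread: 90 training rows, 10 test rows
x      <- matrix(rnorm(100 * 20), 100, 20)
trainX <- x[11:100, ]
testX  <- x[1:10, ]

# Building a formula does not evaluate it, so Surv() is not needed here
f <- Surv(trainTime, trainStatus) ~ trainX

# Variable names on the right-hand side of the formula
rhs_vars <- all.vars(delete.response(terms(f)))

# 1. newdata as posted: the columns are X1 ... X20
nd1 <- data.frame(testX)
rhs_vars %in% names(nd1)

# 2. after renaming the columns to (assumed) coefficient-style labels,
#    there is still no column called "trainX"
colnames(testX) <- paste0("trainX", 1:20)
nd2 <- data.frame(testX)
rhs_vars %in% names(nd2)

# 3. a single column named "trainX" that holds the whole 10 x 20 matrix
#    (hypothetical construction; only the name match is checked)
nd3 <- data.frame(trainX = I(testX))
rhs_vars %in% names(nd3)
dim(nd3$trainX)
```

In the first two frames no column is called trainX, so a lookup of that name would have to fall through to the 90-row matrix in the workspace, which matches the 90 curves reported in the original post.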


David Winsemius, MD
West Hartford, CT
