But note that there may be deeper, non-statistical issues of what you mean
by "validation" here: how good must the predictions be on the validation
data? How similar or dissimilar should the validation data be to the
"training" data? To what end, and to what population, is the fitted model
to be applied? For example, AFAIK, in most scientific research a model is
not considered "validated" unless its results can be substantively
reproduced in different labs, sometimes with alternative methods.

Think of the 1919 measurements of star positions during a total solar
eclipse to "validate" Einstein's Theory of General Relativity.
My point is not to say that this kind of "validation" is appropriate for a
Cox model, but only that the issues are worth thinking about.
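
On the narrower statistical question of how predictions can be scored when
the only observation is death or censoring at time x: one commonly used
discrimination measure is Harrell's concordance index (C-index), which needs
only a predicted risk score per case, not an "observed curve". The following
is a minimal sketch in plain Python, not the Design package's implementation.
The function name, the toy data, and the choice to skip pairs with tied
times are assumptions made for illustration.

```python
from typing import Sequence


def concordance_index(times: Sequence[float],
                      events: Sequence[int],
                      risks: Sequence[float]) -> float:
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  : observed follow-up time for each case
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk score; higher means earlier expected failure
             (for a Cox model, the linear predictor would play this role)
    """
    n = len(times)
    if len(events) != n or len(risks) != n:
        raise ValueError("times, events and risks must have equal length")

    concordant = 0.0
    usable = 0
    for i in range(n):
        if not events[i]:
            # A censored case cannot be the earlier member of a usable pair.
            continue
        for j in range(n):
            # Usable pair: case i failed strictly before case j's observed time.
            # Assumption: pairs with tied times are skipped.
            if times[i] < times[j]:
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs (no events before other cases)")
    return concordant / usable


# Hypothetical external test set: five cases, two of them censored.
test_times = [2, 4, 5, 7, 9]
test_events = [1, 1, 0, 1, 0]
test_risks = [2.1, 1.5, 1.7, 0.3, 0.4]

print(concordance_index(test_times, test_events, test_risks))  # 0.75
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to
perfect ranking. Note that this measures discrimination only; it says
nothing about calibration, nor does it settle the broader questions above
about what "validated" should mean.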


-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Frank E Harrell Jr
> Sent: Tuesday, September 28, 2004 10:11 AM
> To: Min-Han Tan
> Cc: [EMAIL PROTECTED]
> Subject: Re: [R] Validating a Cox model on an external set
> 
> Min-Han Tan wrote:
> > Good morning,
> > 
> > Sorry to trouble the list. 
> > 
> > I have a problem I hope to seek your advice on. 
> >  
> > Essentially, I am trying to 'validate' a multivariate Cox proportional
> > hazards model built in a training set, by testing it on an external
> > test set. I have performed a survfit using the Cox model to predict
> > survival for the test set, and obtained individual predictions for
> > survival time, with standard error for each test sample. Each of these
> > cases has an actual survival time, some censored.
> >  
> > How can we decide whether the Cox model has been validated or not?
> 
> This is what the Design package and its cph and validate.cph and 
> calibrate.cph functions are for.
> 
> >  
> > It was suggested that I use survdiff in the survival package, but
> > survdiff compares curves; I am not sure how I could use it (I have a
> > predicted curve for each case, but no 'observed curve' - the only
> > observation is death or censoring at time x)
> > 
> > Thank you all so much! 
> >  
> > Min-Han Tan
> > Van Andel Institute
> > 
> > ______________________________________________
> > [EMAIL PROTECTED] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> > 
> 
> 
> -- 
> Frank E Harrell Jr   Professor and Chair           School of Medicine
>                      Department of Biostatistics   Vanderbilt University
> 