I consider this a clustering problem (aren't all problems clustering 
problems?).  I have been trying to find a solution but can't find anything more 
sophisticated than pairwise t-tests, which is less than optimal.

The problem we are attacking is the following.  In cancer epidemiology, survival 
curves are estimated for different strata (i.e., different curves for different 
tumor types by tumor grade by gender by age category, etc.).  Rob Culverhouse 
and I have been publishing work on non-linear modeling in genetics and want to 
apply it to the analysis of this type of cancer survival data.

We are starting with lung cancer data (several tens of thousands of records) 
with survival time/censoring as well as four tumor types (e.g., adenocarcinoma, 
small cell) and five tumor grades (grades I-IV and unknown), giving us a 4 x 5 
table.  Within each cell is a survival curve.

We would like to collapse these 20 cells into a smaller number such that cells 
collapsed together have equal survival functions.
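
To make the goal concrete, here is a rough sketch of the naive pairwise 
baseline, using a two-sample log-rank test in place of a t-test because the 
data are censored.  It is an illustration only, not a method we have validated: 
the greedy merge rule, the function names (logrank_p, collapse), the alpha 
threshold and the toy cell labels are all assumptions of mine, there is no 
correction for multiple comparisons, and the test as written is far too slow 
for tens of thousands of records.

```python
import math
import random


def logrank_p(a, b):
    """Two-sample log-rank test.

    a, b: lists of (time, event) pairs, event = 1 for death, 0 for censored.
    Returns the p-value from a chi-square distribution with 1 df.
    """
    event_times = sorted({t for t, e in a + b if e})
    o_minus_e = 0.0  # observed minus expected deaths in sample a
    var = 0.0        # hypergeometric variance, summed over event times
    for t in event_times:
        n1 = sum(1 for u, _ in a if u >= t)       # at risk in a
        n2 = sum(1 for u, _ in b if u >= t)       # at risk in b
        d1 = sum(1 for u, e in a if u == t and e)  # deaths in a at t
        d2 = sum(1 for u, e in b if u == t and e)  # deaths in b at t
        n, d = n1 + n2, d1 + d2
        if n < 2 or n1 == 0 or n2 == 0:
            continue  # no information about a difference at this time
        o_minus_e += d1 - d * n1 / n
        var += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    if var == 0.0:
        return 1.0
    chi2 = o_minus_e ** 2 / var
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def collapse(cells, alpha=0.05):
    """Greedy collapsing (an assumed heuristic, not an established method).

    cells: dict mapping a cell label to its list of (time, event) pairs.
    Repeatedly pools the two groups with the largest log-rank p-value and
    stops once every remaining pair differs at level alpha.
    Returns a sorted list of groups, each a sorted list of cell labels.
    """
    groups = {(name,): list(data) for name, data in cells.items()}
    while len(groups) > 1:
        keys = list(groups)
        best_p, (g, h) = max(
            (logrank_p(groups[g], groups[h]), (g, h))
            for i, g in enumerate(keys)
            for h in keys[i + 1:]
        )
        if best_p < alpha:
            break
        groups[g + h] = groups.pop(g) + groups.pop(h)
    return sorted(sorted(g) for g in groups)


# Toy data only: exponential survival times with administrative censoring
# at time 5.  The labels and hazard rates are made up for illustration.
rng = random.Random(0)


def simulate(rate, size=100):
    times = [rng.expovariate(rate) for _ in range(size)]
    return [(min(t, 5.0), 1 if t < 5.0 else 0) for t in times]


toy_cells = {
    "adeno/I": simulate(0.2),
    "adeno/II": simulate(0.2),
    "small cell/I": simulate(1.0),
}
print(collapse(toy_cells))
```

The weakness is visible in the stopping rule: each merge decision rests on a 
single uncorrected pairwise test, and "not significantly different" is being 
treated as "equal".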

Ideally, I would like a method analogous to post-hoc multiple comparisons in 
ANOVA or to the G^2 statistic (?) in log-linear modeling.

Any hints would be appreciated.

Thanks
Bill





----------------------------------------------
CLASS-L list.
Instructions: http://www.classification-society.org/csna/lists.html#class-l
 
