Hi Peter,

I am unaware of SPINA and am downloading party now to look into that software.  
I generally have used rpart (because Salford is so expensive) but have never 
dealt with this many variables with rpart.

Do you have any way to reduce the number of covariates before partitioning?  I 
would be concerned about the curse of dimensionality with 900 variables and 
1,000 data points.  It would be very easy to find excellent classifiers based 
on noise.  Some suggest that a split data set (train on one subset randomly 
selected from the 1,000 data points and test on the remaining) overcomes this.  
However, if a variable X discriminates well purely by chance because of the 
curse of dimensionality, then it will discriminate well in both the training 
and test data sets.
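
To make the concern about noise concrete, here is a minimal sketch that generates pure-noise data of the size discussed (1,000 subjects, 900 standardized normal predictors, a random dichotomous outcome) and counts how many predictors appear to separate the two groups. The function names, the seed, and the use of a Welch t statistic as the measure of separation are my assumptions for illustration; this is not a tree fit with rpart or party.

```python
import math
import random


def welch_t(a, b):
    """Welch two-sample t statistic for two lists of floats."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))


def noise_scan(n_subjects=1000, n_vars=900, seed=0):
    """Return one t statistic per pure-noise predictor (hypothetical helper)."""
    rng = random.Random(seed)
    # Random dichotomous outcome, unrelated to any predictor.
    labels = [rng.randint(0, 1) for _ in range(n_subjects)]
    t_stats = []
    for _ in range(n_vars):
        # Each predictor is independent standard normal noise.
        x = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
        g0 = [v for v, y in zip(x, labels) if y == 0]
        g1 = [v for v, y in zip(x, labels) if y == 1]
        t_stats.append(welch_t(g0, g1))
    return t_stats


t_stats = noise_scan()
max_abs_t = max(abs(t) for t in t_stats)
n_nominal = sum(abs(t) > 1.96 for t in t_stats)
print("largest |t| among noise predictors:", round(max_abs_t, 2))
print("noise predictors with |t| > 1.96:", n_nominal)
```

With 900 independent tests at a nominal 5% level, roughly 45 noise predictors would be expected to cross the 1.96 cutoff, so some apparently "good" splitting variables will turn up even when there is no signal at all.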

Can you reduce the 900 covariates by PCA, or perhaps use an upfront stepwise 
linear discriminant analysis with a high p-value threshold for retaining a 
covariate (say p = .2)?  We have a paper where we proposed and tested a genetic 
algorithm to reduce the number of variables in microarray data that I can send 
you in a couple of weeks when I get back to St. Louis.  It is being published 
in Sept. in the Interface Proceedings.
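
As a rough illustration of screening with a liberal threshold, the sketch below keeps each covariate whose two-group comparison gives p <= 0.2. Please note the assumptions: this is a simple one-variable-at-a-time filter, not the stepwise linear discriminant analysis mentioned above; the p-value uses a normal approximation to the Welch t statistic; and `screen_covariates`, `two_sided_p`, and the small synthetic data set are hypothetical names and data made up for the example.

```python
import math
import random
from statistics import NormalDist, mean, variance


def two_sided_p(a, b):
    """Approximate two-sided p-value for a difference in means (normal approx.)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(a) - mean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def screen_covariates(columns, labels, threshold=0.2):
    """Return indices of covariates with p <= threshold (hypothetical helper)."""
    kept = []
    for j, col in enumerate(columns):
        g0 = [v for v, y in zip(col, labels) if y == 0]
        g1 = [v for v, y in zip(col, labels) if y == 1]
        if two_sided_p(g0, g1) <= threshold:
            kept.append(j)
    return kept


# Synthetic example: 200 subjects, 50 covariates, only the first 3 carry signal.
rng = random.Random(1)
labels = [i % 2 for i in range(200)]  # balanced dichotomous outcome
columns = []
for j in range(50):
    shift = 1.0 if j < 3 else 0.0  # mean shift of 1 SD for the signal covariates
    columns.append([rng.gauss(shift * y, 1.0) for y in labels])

kept = screen_covariates(columns, labels)
print("covariates retained:", len(kept), "of", len(columns))
print("first retained indices:", kept[:3])
```

A threshold as liberal as 0.2 is meant to avoid discarding weak but real predictors; by construction it also lets through about one in five pure-noise covariates, so it only thins the set and does not replace later validation.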

Good luck.
Bill Shannon
Washington Univ. School of Medicine, St. Louis 
314-704-8725

Peter Flom <[EMAIL PROTECTED]> wrote:

 Tree software

 I have been getting 
involved with classification trees, and have some questions regarding software. 
 My data consist of the following:
 
 about 1,000 subjects - likely to increase but not dramatically
 
 about 900 independent or predictor variables - all continuous, some highly 
correlated, all standardized and approximately normally distributed
 
 outcome which can be dichotomous or categorical, with up to 10 or so 
categories.
 
 I have been using software from R - both Torsten Hothorn's party package and 
Therneau and Atkinson's rpart - but these bog down when the tree is not 
dichotomous.
 
 I have investigated Salford System's software, which is very impressive, but 
expensive, and may be beyond our budget.
 
 I've looked briefly at SPINA.
 
 
 I'd appreciate any advice or references to recent reviews.
 
 Thanks
 
 Peter L. Flom, PhD
 Brainscope, Inc.
 212 263 7863 (MTW)
 212 845 4485 (Th)
 917 488 7176 (F)
 
 
  
 ----------------------------------------------
 CLASS-L list.
 Instructions: http://www.classification-society.org/csna/lists.html#class-l
