On Tue, 21 Jan 2003, Doug Kitch wrote:

> Hello. I am not sure if you can help me or not but I have a dataset with
> N ~ 4000 with binary response and p ~ 0.08, regardless of how many or
> how few variables I offer I get the following message: 'Error in
> rpart(formula, method="class"): No splits could be created Dumped.' If I
> run tree with the same dataset (no missing data) in S I get results. Is
> there a problem with large datasets in rpart?

If there were it would not be relevant: 4000 is not close to `large'. I
suspect you ought to be using losses with such a skewed binary response,
and am not surprised that no single split is effective. ?rpart.control
should help you.

> Also, do you happen to know the parameter options which
> will make rpart and tree act the same. I am wondering if
> this is possible since I have no missing data.

It's not exactly possible, but look in MASS4 for some comparisons. Given
that tree in S does not do what it is documented to do, it would be hard
to reproduce, but tree in R comes pretty close to tree in S's documented
behaviour.

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
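
For illustration only, here is a minimal sketch of what "using losses" and adjusting rpart.control might look like for a rare binary response. Everything specific in it is an assumption and not taken from the original post: the data are synthetic, the variable names (y, x1, x2) are made up, and the 10:1 loss ratio and cp = 0.001 are arbitrary values that would need tuning on real data.

```r
library(rpart)

set.seed(1)
n  <- 4000
x1 <- rnorm(n)
x2 <- rnorm(n)

# Synthetic stand-in data (assumption): a rare positive class that
# depends weakly on x1; x2 is pure noise.
y   <- factor(rbinom(n, 1, plogis(-2.8 + 0.9 * x1)), levels = c(0, 1))
dat <- data.frame(y, x1, x2)

# Loss matrix: rows are the true class, columns the predicted class,
# with zeros on the diagonal. matrix() fills column by column, so
# loss[2, 1] = 10 is the cost of calling a true "1" a "0".
# The 10:1 ratio is an arbitrary illustrative choice.
loss <- matrix(c(0, 10, 1, 0), nrow = 2)

# A smaller complexity parameter than the default (0.01) lets rpart
# keep splits that improve the fit only slightly.
ctrl <- rpart.control(cp = 0.001, minsplit = 20)

fit <- rpart(y ~ x1 + x2, data = dat, method = "class",
             parms = list(loss = loss), control = ctrl)

printcp(fit)  # complexity table for the fitted tree
```

With equal losses and the default cp, predicting the majority class everywhere is hard for any single split to beat when p is about 0.08; weighting the rare class more heavily, lowering cp, or both, changes that trade-off. Whether the resulting tree is useful still has to be checked, for example via the cross-validated error reported by printcp.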
