Hi,
I am currently studying decision trees using the rpart package in R. I
artificially created a data set that includes the dependent variable
(y) and a few independent variables (x1, x2, ...). The dependent variable
y takes only the values 0 and 1: 90% of the y values are 1 and 10% are 0.
When I apply rpart to it, there is no splitting at all.
I am wondering whether this is because of the "special" distribution of
y. Since the majority of the y values are 1 (so the data set carries
little information), rpart seems to regard it as already a single class
and therefore does not proceed any further. If this understanding is
correct, what should I do if I still want rpart to split this data set?
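A minimal reproducible sketch of the situation, using made-up data rather than the actual data set: the sample size, the seed, and the noise predictors x1 and x2 are assumptions for illustration only. It also shows two rpart settings that may be worth trying, a looser complexity parameter (cp, minsplit in rpart.control) and equal class priors (parms), without any claim that either is the right fix here.

```r
library(rpart)

set.seed(1)
n <- 1000

# Hypothetical data: y is 1 about 90% of the time; x1 and x2 are pure noise
d <- data.frame(
  y  = factor(rbinom(n, 1, 0.9)),
  x1 = rnorm(n),
  x2 = rnorm(n)
)

# Default settings (cp = 0.01, minsplit = 20); on data like this the
# result is often just the root node
fit <- rpart(y ~ x1 + x2, data = d, method = "class")
print(fit)

# Looser stopping rules: cp = 0 keeps splits even if they improve the fit
# only slightly, and minsplit = 10 allows smaller nodes to be split
fit_loose <- rpart(y ~ x1 + x2, data = d, method = "class",
                   control = rpart.control(cp = 0, minsplit = 10))

# Equal priors: treat the two classes as equally likely, so the rare
# class (y = 0) carries more weight when splits are evaluated
fit_prior <- rpart(y ~ x1 + x2, data = d, method = "class",
                   parms = list(prior = c(0.5, 0.5)))
```

Comparing print(fit) with print(fit_loose) and print(fit_prior) shows whether the missing splits come from the stopping rules or from the class imbalance. A tree grown with cp = 0 on noise predictors will overfit, so it is only useful for diagnosis.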
Thanks a lot!
Ningwei