Re: [R] Odd results from rpart classification tree

2017-05-15 Thread Marshall, Jonathan
Thanks Terry! I managed to figure that out shortly after posting (as is the way!). Adding an additional covariate that splits below one of the x branches but not the other, and that pushes the class proportion over 0.5, means the x split is retained. However, I now have another conundrum, this

Re: [R] Odd results from rpart classification tree

2017-05-15 Thread Therneau, Terry M., Ph.D.
You are mixing up two of the steps in rpart: (1) how to find the best candidate split, and (2) evaluation of that split. With the "class" method we use the information or Gini criterion for step 1. The code finds a worthwhile candidate split at 0.5 using exactly the calculations you outline.
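To make the two steps concrete, here is a minimal base-R sketch using the data from the original post. It assumes that step 2 compares the number of misclassified cases before and after the split; the excerpt above is cut off before it says so, and the helper names `gini` and `n_errors` are illustrative, not part of rpart.

```r
# Data from the original post.
y <- c(rep(0, 65), rep(1, 15), rep(0, 20))
x <- c(rep(0, 70), rep(1, 30))

# Gini impurity of a 0/1 vector: 2 * p * (1 - p).
# (Hypothetical helper, not an rpart function.)
gini <- function(v) {
  p <- mean(v)
  2 * p * (1 - p)
}

# Cases misclassified when every case gets the majority class.
# (Hypothetical helper, not an rpart function.)
n_errors <- function(v) min(sum(v == 0), sum(v == 1))

# The candidate split at x = 0.5.
left  <- y[x < 0.5]
right <- y[x >= 0.5]

# Step 1: size-weighted Gini impurity before and after the split.
gini_root  <- gini(y)
gini_split <- (length(left) * gini(left) +
               length(right) * gini(right)) / length(y)

# Step 2 (assumed): misclassification count before and after the split.
err_root  <- n_errors(y)
err_split <- n_errors(left) + n_errors(right)

cat("Gini:   root =", round(gini_root, 4), " split =", round(gini_split, 4), "\n")
cat("Errors: root =", err_root, " split =", err_split, "\n")
```

On this data the Gini impurity falls from 0.255 to about 0.226, so the split is a worthwhile candidate. Both children still predict class 0, though, so the misclassification count is 15 either way. Under the stated assumption, that is why the split is found but not kept.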

[R] Odd results from rpart classification tree

2017-05-15 Thread Marshall, Jonathan
The following code produces a tree with only a root. However, the tree with a split at x = 0.5 is clearly better; rpart doesn't seem to want to produce it.
y <- c(rep(0,65),rep(1,15),rep(0,20))
x <- c(rep(0,70),rep(1,30))
f <- rpart(y ~ x,