Hi All,

I am using the "rpart" function to model my data:

library(rpart)
set.seed(1)
x1 = sample(c("a","b"), 100, replace=TRUE)
x2 = runif(100)
y = ifelse((x1=="a" & x2<=0.5)|(x1=="b" & x2>0.5), 0, 1)
data = data.frame(x1,x2,y)
fit = rpart(y~x1+x2, data=data)
plot(fit); text(fit, use.n=TRUE)

Now I want to force the variable "x1" to be used for the first split. I have thought of the following three approaches, but I cannot complete any of them:

1) Is there any way to fix the variable used for the first split? (It would be even better if I could fix the variable used for any particular split.)

2) If I have the data and I also have some classification rule, is there any way I can convert my rule into an "rpart" object? (so that I can use it for plotting the tree and to predict on some new data)
For example:
If I want my rule for prediction to be:
y = if (x1=="a" and x2<=0.5) or (x1=="b" and x2>0.5) then 0,  otherwise 1
How do I get the corresponding rpart object?

3) If I split my data according to the desired variable and fit a separate CART model on each subset:
spl = split(data,data$x1)
fit1 = rpart(y~x1,data=data, control=list(cp=0.00001,maxdepth=1))
fit2 = rpart(y~x2, data=spl[[1]])
fit3 = rpart(y~x2, data=spl[[2]])

Is there any way I can merge fit1, fit2 and fit3 into a single rpart object? (so that I can use it for plotting the tree and for predicting on new data)
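
To make approach 3 concrete, here is a minimal sketch of the prediction half only. It assumes a hypothetical helper, predict_by_x1 (my own name, not part of rpart), that sends each new row to the subtree fitted for its x1 level. It does not produce a single rpart object, so it does not solve the plotting part.

```r
library(rpart)

# Rebuild the example data from the top of this post.
set.seed(1)
x1 <- sample(c("a", "b"), 100, replace = TRUE)
x2 <- runif(100)
y  <- ifelse((x1 == "a" & x2 <= 0.5) | (x1 == "b" & x2 > 0.5), 0, 1)
data <- data.frame(x1, x2, y)

# Fit one subtree per level of x1; the list is named by level ("a", "b").
spl  <- split(data, data$x1)
fits <- lapply(spl, function(d) rpart(y ~ x2, data = d))

# Hypothetical helper (not an rpart function): route each row of newdata
# to the subtree for its x1 level. Rows with an unseen level stay NA.
predict_by_x1 <- function(fits, newdata) {
  out <- rep(NA_real_, nrow(newdata))
  for (lev in names(fits)) {
    idx <- as.character(newdata$x1) == lev
    if (any(idx)) {
      out[idx] <- predict(fits[[lev]], newdata = newdata[idx, , drop = FALSE])
    }
  }
  out
}

newdata <- data.frame(x1 = c("a", "a", "b", "b"),
                      x2 = c(0.2, 0.9, 0.2, 0.9))
pred <- predict_by_x1(fits, newdata)
print(pred)
```

Here the forced first split on x1 is done by hand in the helper rather than by rpart, which is why I would still prefer a single merged object.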

An answer to any of the three questions above would solve my problem. Any hints would be helpful.

Thanks in advance,
Regards,
Utkarsh

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.