On Mon, 22 Aug 2016, MIKE DE LA HOZ wrote:
Hi, I am running a chaid tree using titanic dataset (see attachment)

setwd("C:/Users/miguel")
titanic <- read.csv("train.csv")
titanic.s <- subset( titanic, select = -c(PassengerId, Name ) )
ctrl <- chaid_control(minsplit = 20, minbucket = 5, minprob = 0)
chaidTitanic <- chaid(Survived ~ ., data = titanic, control = ctrl)

It looks like I get the following error

Error: is.factor(x) is not TRUE

can you please help me here? I am not able to follow this type of error. if you can rewrite the sentence for me, It will be much appreciated
To be able to apply the chaid() function all variables (both response and predictor) need to be categorical variables, i.e., in R of class "factor".
It is not clear which variables are the culprits here because your example is not reproducible. I guess that there are at least some numeric regressor variables. Maybe the "Survived" response is also in numeric dummy coding rather than the appropriate "factor" variable.
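As a rough sketch of how one might find and convert the offending columns: the data frame below is a made-up stand-in (I do not have your train.csv), and the age cut points and level labels are arbitrary assumptions chosen only for illustration.

```r
# Hypothetical toy data; column names mimic train.csv, values are made up.
toy <- data.frame(
  Survived = c(0, 1, 1, 0, 1, 0),
  Pclass   = c(3, 1, 3, 1, 2, 3),
  Sex      = factor(c("male", "female", "female", "female", "male", "male")),
  Age      = c(22, 38, 26, 35, 54, 2)
)

# Columns that are not factors are the ones that trip is.factor(x).
not_factor <- names(toy)[!vapply(toy, is.factor, logical(1))]
print(not_factor)

# Codes can be turned into factors directly; continuous variables
# have to be binned first (the breaks here are an arbitrary choice).
toy$Survived <- factor(toy$Survived, levels = c(0, 1), labels = c("no", "yes"))
toy$Pclass   <- factor(toy$Pclass)
toy$Age      <- cut(toy$Age, breaks = c(0, 18, 40, Inf),
                    labels = c("child", "adult", "older"))

# Every column should now be a factor.
str(toy)
```

Binning a numeric variable such as age discards information, and the result depends on the breaks you pick.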
In any case, I would recommend using a tree model that can deal with both kinds of regressor variables. If you want something that selects split variables and split points based on statistical tests, ctree() from package "partykit" would be the obvious candidate.
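A minimal sketch of the ctree() route, assuming "partykit" is installed. The data are simulated here (the survival probabilities are invented), so the printed tree says nothing about the real Titanic data; the point is only that a factor and a numeric predictor can be used together, with a factor response.

```r
# Assumes the partykit package is installed; all data below are made up.
if (requireNamespace("partykit", quietly = TRUE)) {
  set.seed(1)
  n <- 200
  toy <- data.frame(
    Sex = factor(sample(c("female", "male"), n, replace = TRUE)),
    Age = round(runif(n, 1, 80))
  )
  # Invented outcome that depends on both predictors.
  p <- ifelse(toy$Sex == "female", 0.75, 0.2) + ifelse(toy$Age < 12, 0.2, 0)
  toy$Survived <- factor(ifelse(runif(n) < p, "yes", "no"))

  # The response is a factor; Sex is a factor and Age stays numeric.
  fit <- partykit::ctree(Survived ~ Sex + Age, data = toy)
  print(fit)
}
```

With your own data the call would presumably be along the lines of ctree(Survived ~ ., data = titanic.s) after converting Survived to a factor, but I cannot check that without the file.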
Thanks

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.