[EMAIL PROTECTED] wrote:
> Hello,
>
> I'm trying to find the optimal number of variables tried at each split
> (mtry parameter) for a randomForest classification. The classification
> is binary and there are 32 explanatory variables (mostly factors with
> up to 4 levels each, plus some numeric variables) and 575 cases.
>
> I've seen that although there are only 32 explanatory variables, the
> best classification performance is reached when choosing mtry=80. How
> is it possible that more variables can be used than there are columns
> in the data frame?

If some of the variables are factors, dummy variables are generated for them, so the number of variables in the later processing is larger than the number of columns in the data frame.

Uwe Ligges

> thanks for your help + kind regards,
>
> Arne
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
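
As a rough illustration of the dummy-variable explanation in the reply, here is a minimal sketch in Python (not R) of the column arithmetic. It assumes treatment coding, where a factor with k levels becomes k - 1 indicator columns and a numeric variable stays as one column. The function name and the mix of 28 four-level factors and 4 numeric variables are hypothetical; the original post gives only the totals (32 variables, up to 4 levels per factor). The sketch does not check how any particular randomForest version handles factors internally.

```python
# Count the columns produced by dummy (treatment) coding of factors.
# Assumption: a factor with k levels yields k - 1 indicator columns,
# and a numeric variable contributes exactly one column.

def expanded_columns(levels_per_variable):
    """Return the number of columns after dummy coding.

    levels_per_variable: one entry per variable, giving the number of
    levels for a factor, or None for a numeric variable.
    """
    total = 0
    for levels in levels_per_variable:
        if levels is None:
            total += 1                    # numeric: one column
        else:
            total += max(levels - 1, 0)   # factor: k - 1 indicators
    return total

# Hypothetical mix (not from the post): 28 four-level factors, 4 numerics
example = [4] * 28 + [None] * 4
n_columns = expanded_columns(example)
print(len(example), "variables ->", n_columns, "columns")
# prints: 32 variables -> 88 columns
```

Under these assumptions, 32 data frame columns expand to 88 coded columns, so a value such as mtry=80 would no longer exceed the number of available variables.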