Hi All,
My question concerns an error generated when using randomForest in R. Is there a special way to format the data to avoid this error, or have I misunderstood what the error implies?
"Error in randomForest.default(m, y, ...) : Can not handle categorical predictors with more than 32 categories."
It is produced by the following command:
> credit.rf <- randomForest(V16 ~ ., data=credit, mtry=2, importance = TRUE, do.trace=100)
The data set is the credit-screening data from the UCI repository, ftp://ftp.ics.uci.edu/pub/machine-learning-databases/credit-screening/crx.data. It consists of 690 samples and 16 attributes.
The attribute information is:
A1: b, a.
A2: continuous.
A3: continuous.
A4: u, y, l, t.
A5: g, p, gg.
A6: c, d, cc, i, j, k, m, r, q, w, x, e, aa, ff.
A7: v, h, bb, j, n, z, dd, ff, o.
A8: continuous.
A9: t, f.
A10: t, f.
A11: continuous.
A12: t, f.
A13: g, p, s.
A14: continuous.
A15: continuous.
A16: +, - (class attribute)
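None of the categorical attributes listed above has more than 14 categories, so the error suggests that at least one column was read in as a factor with far more levels than documented. A minimal sketch of how this could be checked follows. It assumes that crx.data marks missing values with "?", which would turn a continuous column into a factor on import. The helper count_factor_levels and the three-row toy table are hypothetical illustrations, not part of randomForest or of the real data.

```r
# Hypothetical helper (not part of randomForest): report the number of levels
# of every factor column, so any column above the 32-category limit stands out.
count_factor_levels <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  vapply(df[is_fac], nlevels, integer(1))
}

# Toy stand-in for crx.data (made-up rows): V2 is meant to be continuous,
# but one value is recorded as "?".
toy_text <- "b,30.83\na,?\nb,24.50"

# stringsAsFactors = TRUE is passed explicitly so the behaviour is the same
# regardless of the default in the installed R version.
toy <- read.table(text = toy_text, sep = ",", stringsAsFactors = TRUE)
count_factor_levels(toy)    # V2 shows up as a factor, one level per distinct string

# Assumption: declaring "?" as the missing-value code keeps V2 numeric.
toy_fixed <- read.table(text = toy_text, sep = ",", na.strings = "?",
                        stringsAsFactors = TRUE)
count_factor_levels(toy_fixed)
str(toy_fixed)
```

If the same pattern holds for the real file, running count_factor_levels(credit) should show which of V1 to V16 exceeds 32 levels, and re-reading the file with na.strings = "?" may be enough to restore the continuous columns. The rows with missing values would then still need to be handled before fitting.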
Has anyone tried randomForest in R on the credit-screening data set from the UCI repository?
Thanks in advance for any useful hints and tips,
Melanie