> Sklearn does not implement any special treatment for categorical variables.
> You can feed any float. The question is if it would work / what it does.
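
To make the "what it does" part concrete, here is a minimal sketch in plain Python (no sklearn calls). It assumes a made-up four-value category, not one of my actual predictors, and it assumes that each tree split is a single threshold on one numeric column. The integer codes follow sorted order, which is how LabelEncoder assigns them; the function names are only for illustration.

```python
# Toy categories (hypothetical, not from my dataset), with no natural order.
categories = ["blue", "green", "red", "yellow"]

# Integer codes in sorted order, mirroring what LabelEncoder would assign.
codes = {cat: i for i, cat in enumerate(sorted(categories))}


def threshold_splits(codes):
    """Groups that end up on the low side of one split `code <= t`."""
    n = len(codes)
    return [
        frozenset(c for c, i in codes.items() if i <= t)
        for t in range(n - 1)
    ]


def dummy_splits(categories):
    """Groups isolated by one split on a single 0/1 dummy column."""
    return [frozenset([c]) for c in categories]


label_groups = threshold_splits(codes)
dummy_groups = dummy_splits(categories)

for g in label_groups:
    print("integer codes:", sorted(g))
for g in dummy_groups:
    print("dummy column: ", sorted(g))
```

Under these assumptions, one split on the integer codes can only cut the categories along the arbitrary code order, so it can never isolate "green" on its own, whereas one split on a dummy column isolates exactly one category. Deeper trees can chain several splits, so either encoding can eventually reach the same grouping; the difference is how many splits it takes.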

I think I'm confused about a couple of aspects (that's what happens, I guess, when you play with algorithms you don't fully understand beforehand!). I assumed that sklearn-RF's requirement for numerical inputs was just a data representation/implementation detail, and that once the data was properly transformed (i.e. using a LabelEncoder), it wouldn't matter, under the hood, whether a predictor was categorical or numerical.

Now, if I understand you correctly, sklearn cannot explicitly handle the categorical case where no order exists (i.e. categorical, as opposed to ordinal). But you also seem to imply that sklearn can support it indirectly using dummy variables.

Bigger question: given that Decision Trees (in general) support pure categorical variables, shouldn't Random Forests support them too?

>>https://github.com/benhamner/Stack-Overflow-Competition/blob/master/features.py
> I don't see where categorical variables are used in this code. Could you
> please point it out?

You're right, my bad: those are not categorical predictors.

> Not sure what this says about your dataset / features.
> If the variables don't have any ordering and the splits take arbitrary
> subsets, that would seem a bit weird to me.

In fact that's really what I observe: apart from the first of my 4 variables, which is a year, the remaining 3 are purely categorical, with no implicit order. So that result is weird, because it is not in line with what you've been saying.

Anyway, thanks for your time and patience,

Christian

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general