> Sklearn does not implement any special treatment for categorical variables.
> You can feed any float. The question is if it would work / what it does.

I think I'm confused about a couple of points (that's what happens, I
guess, when you play with algorithms you don't fully understand
beforehand!). I assumed that sklearn-RF's requirement for numerical
inputs was just a matter of data representation/implementation, and
that once a predictor was properly transformed (e.g. using a
LabelEncoder), it wouldn't matter, under the hood, whether it was
categorical or numerical.

Now, if I understand you correctly, sklearn isn't able to explicitly
handle the categorical case where no order exists (i.e. purely
categorical, as opposed to ordinal).
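
To make the distinction concrete, here is a minimal pure-Python sketch
(not sklearn code; the function names and the four integer codes are
made up for illustration). It enumerates the binary partitions a single
node can reach with a threshold split on integer codes, versus a split
on an arbitrary subset of categories:

```python
from itertools import combinations


def threshold_partitions(codes):
    """Left-hand sides reachable by one split of the form `code <= t`."""
    ordered = sorted(set(codes))
    return {frozenset(ordered[:i]) for i in range(1, len(ordered))}


def subset_partitions(categories):
    """Left-hand sides reachable by one split on an arbitrary subset."""
    cats = sorted(set(categories))
    anchor, rest = cats[0], cats[1:]
    parts = set()
    # Keep `anchor` on the left so each two-way partition is counted once;
    # stopping at len(rest) - 1 extras excludes the trivial "everything" side.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            parts.add(frozenset((anchor,) + extra))
    return parts


# Hypothetical integer codes for four unordered categories.
codes = [0, 1, 2, 3]
by_threshold = threshold_partitions(codes)
by_subset = subset_partitions(codes)

print(len(by_threshold), "partitions via threshold splits")
print(len(by_subset), "partitions via subset splits")
print("{0, 2} vs {1, 3} reachable by one threshold:",
      frozenset({0, 2}) in by_threshold)
```

With k categories, a threshold on the codes reaches k - 1 partitions,
while subset splits reach 2^(k-1) - 1. A tree can still isolate a
non-contiguous group of codes, but only by stacking several threshold
splits.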

But you also seem to imply that sklearn can support it indirectly,
using dummy variables.
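
As a rough sketch of what the dummy-variable route could look like
(the toy data and the "color" predictor are invented for illustration;
this assumes a scikit-learn version whose OneHotEncoder accepts string
categories directly):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: one nominal predictor with no natural order.
colors = np.array([["red"], ["green"], ["blue"],
                   ["green"], ["red"], ["blue"]] * 5)

# The target depends on membership in {red, blue}, a group that is not
# contiguous under alphabetical integer coding (blue=0, green=1, red=2).
y = np.isin(colors.ravel(), ["red", "blue"]).astype(int)

# One 0/1 column per category; each row has exactly one 1.
encoder = OneHotEncoder()
X_dummy = encoder.fit_transform(colors).toarray()

forest = RandomForestClassifier(n_estimators=20, random_state=0)
forest.fit(X_dummy, y)

# Predict for each category in turn.
queries = [["blue"], ["green"], ["red"]]
predictions = forest.predict(encoder.transform(queries).toarray())
print(dict(zip(["blue", "green", "red"], predictions)))
```

Each dummy column lets a node split "this category vs. the rest", so
no ordering between categories is implied. The cost is one extra
column per category.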

Bigger question: given that Decision Trees (in general) support purely
categorical variables, shouldn't Random Forests support them too?

>>https://github.com/benhamner/Stack-Overflow-Competition/blob/master/features.py
> I don't see where categorical variables are used in this code. Could you
> please point it out?

You're right, my bad: those are not categorical predictors.

> Not sure what this says about your dataset / features.
> If the variables don't have any ordering and the splits take arbitrary
> subsets, that would seem a bit weird to me.

In fact, that's exactly what I observe: apart from the first of my 4
variables, which is a year, the remaining 3 are purely categorical,
with no implicit order. So that result is weird, because it is not in
line with what you've been saying.

Anyway, thanks for your time and patience,

Christian
