Thanks, Sebastian. It's great to know that it works; I just need to do one-hot encoding first.
I have mixed data types (continuous and categorical). Should I use tree.DecisionTreeClassifier() or tree.DecisionTreeRegressor()?

I'm guessing tree.DecisionTreeClassifier()?

Best,
Mike

On Fri, Sep 13, 2019 at 11:59 PM Sebastian Raschka <m...@sebastianraschka.com> wrote:

> Hi,
>
> if you have the category "car" as shown in your example, this would
> effectively be something like
>
> BMW=0
> Toyota=1
> Audi=2
>
> Sure, the algorithm will execute just fine on the feature column with
> values in {0, 1, 2}. However, the problem is that it will come up with
> binary rules like x_i>= 0.5, x_i>= 1.5, and x_i>= 2.5. I.e., it will treat
> it as a continuous variable.
>
> What you can do is to encode this feature via one-hot encoding --
> basically extend it into 2 (or 3) binary variables. This has its own
> problems (if you have a feature with many possible values, you will end up
> with a large number of binary variables, and they may dominate in the
> resulting tree over other feature variables).
>
> In any case, I guess this is what
>
> > "scikit-learn implementation does not support categorical variables for
> > now".
>
> means ;).
>
> Best,
> Sebastian
>
> > On Sep 13, 2019, at 9:38 PM, C W <tmrs...@gmail.com> wrote:
> >
> > Hello all,
> > I'm very confused. Can the decision tree module handle both continuous
> > and categorical features in the dataset? In this case, it's just CART
> > (Classification and Regression Trees).
> >
> > For example,
> > Gender  Age  Income  Car     Attendance
> > Male    30   10000   BMW     Yes
> > Female  35   9000    Toyota  No
> > Male    50   12000   Audi    Yes
> >
> > According to the documentation
> > https://scikit-learn.org/stable/modules/tree.html#tree-algorithms-id3-c4-5-c5-0-and-cart,
> > it can not!
> >
> > It says: "scikit-learn implementation does not support categorical
> > variables for now".
> >
> > Is this true? If not, can someone point me to an example? If yes, what
> > do people do?
> >
> > Thank you very much!
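
For reference, here is a minimal sketch of the one-hot encoding approach discussed in this thread, applied to the example table. It assumes pandas is installed and uses pandas.get_dummies for the encoding (scikit-learn's OneHotEncoder would be another option). The DataFrame layout and random_state=0 are illustrative choices, not something stated in the thread. The choice of estimator follows the target rather than the features: Attendance is a Yes/No label, so DecisionTreeClassifier fits here, whereas a numeric target would call for DecisionTreeRegressor.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data copied from the example table in the thread.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male"],
    "Age": [30, 35, 50],
    "Income": [10000, 9000, 12000],
    "Car": ["BMW", "Toyota", "Audi"],
    "Attendance": ["Yes", "No", "Yes"],
})

# One-hot encode only the categorical feature columns.
# Age and Income are left as-is (continuous features need no encoding).
X = pd.get_dummies(
    df.drop(columns="Attendance"),
    columns=["Gender", "Car"],
    dtype=int,
)
y = df["Attendance"]  # categorical target, hence a classifier

# random_state is fixed only to make the sketch reproducible.
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Print the learned split rules using the encoded column names.
print(export_text(clf, feature_names=list(X.columns)))
```

With this encoding, any split on a car brand takes the form "Car_BMW <= 0.5", i.e. a yes/no question about a single category, rather than a threshold on an arbitrary integer code.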
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn