From a theoretical point of view, yes, you should one-hot encode your categorical variables if you don't want any ordering to be implied.
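To make the ordering question concrete, here is a minimal pure-Python sketch. It is not scikit-learn code. It assumes splits of the form x <= t, which is how scikit-learn's trees treat numeric inputs, and it searches every possible tree exhaustively instead of growing one greedily. The colour names, the integer codes and the helper names are all made up for illustration.

```python
# Toy data: the label is 1 only for the "middle" category under an
# arbitrary integer coding (the codes are an assumption for illustration).
codes = {"red": 0, "green": 1, "blue": 2}
samples = ["red", "green", "blue"] * 5
labels = [1 if s == "green" else 0 for s in samples]

int_rows = [[codes[s]] for s in samples]                                # one integer column
onehot_rows = [[1 if s == c else 0 for c in codes] for s in samples]   # one 0/1 column per category


def majority_errors(ys):
    """Training errors if a leaf predicts its majority label."""
    ones = sum(ys)
    return min(ones, len(ys) - ones)


def best_errors(rows, ys, depth):
    """Fewest training errors over all trees of at most `depth` levels of
    threshold splits (exhaustive search, only feasible on tiny data)."""
    base = majority_errors(ys)
    if depth == 0 or base == 0:
        return base
    best = base
    for j in range(len(rows[0])):
        uniq = sorted({r[j] for r in rows})
        for a, b in zip(uniq, uniq[1:]):
            t = (a + b) / 2  # candidate threshold between adjacent values
            left = [i for i, r in enumerate(rows) if r[j] <= t]
            right = [i for i, r in enumerate(rows) if r[j] > t]
            err = (
                best_errors([rows[i] for i in left], [ys[i] for i in left], depth - 1)
                + best_errors([rows[i] for i in right], [ys[i] for i in right], depth - 1)
            )
            best = min(best, err)
    return best


print(best_errors(int_rows, labels, 1))     # 5: one threshold cannot isolate the middle code
print(best_errors(int_rows, labels, 2))     # 0: a second split recovers the category
print(best_errors(onehot_rows, labels, 1))  # 0: one split on the "green" column is enough
```

So each individual split does treat the integer codes as ordered, but a deep enough tree can still carve out any category. That is consistent with the advice quoted below to try both encodings and compare them with cross-validation.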
Brian

On 29 Mar 2017 08:40, "Andrew Howe" <ahow...@gmail.com> wrote:

> My question is more along the lines of will the DT classifier falsely
> infer an ordering?
>
> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
> J. Andrew Howe, PhD
> www.andrewhowe.com
> http://www.linkedin.com/in/ahowe42
> https://www.researchgate.net/profile/John_Howe12/
> I live to learn, so I can learn to live. - me
> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
>
> On Wed, Mar 29, 2017 at 10:32 AM, Olivier Grisel <olivier.gri...@ensta.org>
> wrote:
>
>> For large enough models (e.g. random forests or gradient boosted trees
>> ensembles) I would definitely recommend arbitrary integer coding for
>> the categorical variables.
>>
>> Try both, use cross-validation and see for yourself.
>>
>> --
>> Olivier
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn