From a theoretical point of view, yes, you should one-hot encode your
categorical variables if you don't want any ordering to be implied.
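
To make the "implied ordering" concrete, here is a small pure-Python sketch
(a toy model, not scikit-learn's actual split search) that lists which
groupings of four categories a single threshold split `feature <= t` can
produce under each encoding. The function name `single_split_partitions`
and the colour labels are made up for illustration.

```python
def single_split_partitions(rows):
    """Return every two-way grouping of category labels that one
    axis-aligned split `feature <= threshold` can produce.

    `rows` maps each category label to its numeric feature vector.
    """
    partitions = set()
    n_features = len(next(iter(rows.values())))
    for j in range(n_features):
        values = sorted({vec[j] for vec in rows.values()})
        # Candidate thresholds sit midway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = frozenset(c for c, vec in rows.items() if vec[j] <= t)
            right = frozenset(rows) - left
            # Store the unordered pair so {A | B} and {B | A} count once.
            partitions.add(frozenset({left, right}))
    return partitions


categories = ["red", "green", "blue", "yellow"]

# Arbitrary integer coding: a single column holding codes 0..3.
integer_coded = {c: (i,) for i, c in enumerate(categories)}

# One-hot coding: one 0/1 indicator column per category.
one_hot = {
    c: tuple(int(i == j) for j in range(len(categories)))
    for i, c in enumerate(categories)
}

int_splits = single_split_partitions(integer_coded)
hot_splits = single_split_partitions(one_hot)

print("integer coding:", len(int_splits), "possible single splits")
print("one-hot coding:", len(hot_splits), "possible single splits")
for part in sorted(hot_splits - int_splits, key=lambda p: sorted(map(sorted, p))):
    print("one-hot only:", [sorted(group) for group in part])
```

With the arbitrary integer codes, one split can only cut the code order at
some point, so isolating `green` takes two splits; with one-hot columns, one
split isolates any single category. Deeper trees and ensembles can stack
splits to recover such groupings, which fits the advice quoted below to try
both encodings under cross-validation.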

Brian

On 29 Mar 2017 08:40, "Andrew Howe" <ahow...@gmail.com> wrote:

> My question is more along the lines of: will the decision tree (DT)
> classifier falsely infer an ordering?
>
> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
> J. Andrew Howe, PhD
> www.andrewhowe.com
> http://www.linkedin.com/in/ahowe42
> https://www.researchgate.net/profile/John_Howe12/
> I live to learn, so I can learn to live. - me
> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
>
> On Wed, Mar 29, 2017 at 10:32 AM, Olivier Grisel <olivier.gri...@ensta.org> wrote:
>
>> For large enough models (e.g. random forests or gradient-boosted tree
>> ensembles) I would definitely recommend arbitrary integer coding for
>> the categorical variables.
>>
>> Try both, use cross-validation and see for yourself.
>>
>> --
>> Olivier
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn@python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
