Thank you, Jacob. I appreciate it. Regarding "perform better": I was referring to better accuracy, precision, recall, F1 score, etc.
Thanks,
Raga

On Fri, Jul 21, 2017 at 2:27 PM, Jacob Schreiber <jmschreibe...@gmail.com> wrote:

> Traditionally, tree-based methods are very good when it comes to
> categorical variables and can handle them appropriately. There is a
> current WIP PR to add this support to sklearn. I'm not exactly sure what
> you mean by "perform better", though. Estimators that ignore the
> categorical aspect of these variables and treat them as discrete will
> likely perform worse than those that treat them appropriately.
>
> On Fri, Jul 21, 2017 at 8:11 AM, Raga Markely <raga.mark...@gmail.com> wrote:
>
>> Hello,
>>
>> I am wondering if there are some classifiers that perform better for
>> datasets with categorical features (converted into a sparse input matrix
>> with pd.get_dummies())? The data for the categorical features are nominal
>> (order doesn't matter, e.g. country, occupation, etc.).
>>
>> If you could provide me with some references (papers, books, websites,
>> etc.), that would be great.
>>
>> Thank you very much!
>> Raga
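For anyone following along, a minimal sketch of the workflow discussed above: one-hot encoding a nominal feature with pd.get_dummies() and fitting a tree-based classifier. The column names and toy data here are purely illustrative assumptions, not from the original posts.

```python
# Illustrative sketch (toy data is an assumption, not from the thread):
# expand a nominal feature into indicator columns, then fit a tree ensemble.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.DataFrame({
    "country": ["US", "DE", "US", "FR", "DE", "FR"],  # nominal: order carries no meaning
    "label":   [0,    1,    0,    1,    1,    0],
})

# pd.get_dummies() creates one indicator column per category,
# so no artificial ordering is imposed on the nominal values.
X = pd.get_dummies(df[["country"]])
y = df["label"]

clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(X, y)
```

Note that tree-based estimators can split on these indicator columns individually, whereas treating the raw categories as ordered integers would let the trees make splits that imply a meaningless ordering between, say, countries.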
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn