Thank you, Jacob. Appreciate it. Regarding 'perform better', I was referring to better accuracy, precision, recall, F1 score, etc.
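
For concreteness, this is roughly the kind of comparison I have in mind. It is only a sketch: the toy data, the choice of the two estimators, and the `evaluate` helper are all made up for illustration and do not come from a real experiment.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import cross_val_predict

# Hypothetical toy data: two nominal features (order does not matter).
df = pd.DataFrame({
    "country": ["US", "FR", "JP", "US", "FR", "JP"] * 10,
    "occupation": ["eng", "eng", "doc", "doc", "law", "law"] * 10,
})
# Made-up binary label, only so that there is something to predict.
y = ((df["country"] == "US") | (df["occupation"] == "law")).astype(int)

# One indicator column per category value.
X = pd.get_dummies(df, dtype=int)


def evaluate(clf, X, y):
    """Return cross-validated accuracy, precision, recall and F1 for clf."""
    pred = cross_val_predict(clf, X, y, cv=5)
    return {
        "accuracy": accuracy_score(y, pred),
        "precision": precision_score(y, pred),
        "recall": recall_score(y, pred),
        "f1": f1_score(y, pred),
    }


# Two arbitrary candidates: one tree-based, one linear.
candidates = {
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "logistic_regression": LogisticRegression(),
}
results = {name: evaluate(clf, X, y) for name, clf in candidates.items()}

for name, metrics in results.items():
    print(name, metrics)
```

Both estimators here see the same one-hot columns, so this only compares how they do on the encoded input, not whether either one treats the features as truly categorical.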

Thanks,
Raga

On Fri, Jul 21, 2017 at 2:27 PM, Jacob Schreiber <[email protected]> wrote:

> Traditionally, tree-based methods are very good when it comes to
> categorical variables and can handle them appropriately. There is a current
> WIP PR to add this support to sklearn. I'm not exactly sure what you mean
> by "perform better", though. Estimators that ignore the categorical aspect
> of these variables and treat them as discrete will likely perform worse
> than those that treat them appropriately.
>
> On Fri, Jul 21, 2017 at 8:11 AM, Raga Markely <[email protected]>
> wrote:
>
>> Hello,
>>
>> I am wondering if there are some classifiers that perform better for
>> datasets with categorical features (converted into a sparse input matrix
>> with pd.get_dummies())? The data for the categorical features are nominal
>> (order doesn't matter, e.g. country, occupation, etc.).
>>
>> If you could provide me with some references (papers, books, websites,
>> etc.), that would be great.
>>
>> Thank you very much!
>> Raga

_______________________________________________
scikit-learn mailing list
[email protected]
https://mail.python.org/mailman/listinfo/scikit-learn
