+1

LCS and its many, many variants seem very practical and adaptable. I'm not sure why they haven't gotten traction. Overshadowed by GBM & random forests?
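
For the original question quoted further down, here is a minimal sketch of how one might check the "treat them as discrete vs. treat them appropriately" point empirically. Everything in it is an assumption for illustration: the data are synthetic, the column names (country, occupation) are borrowed from the examples in the question, the helper mean_scores is hypothetical, and logistic regression is just one convenient estimator, not a recommendation from the thread. It compares a one-hot encoding from pd.get_dummies() against plain integer codes using the metrics mentioned (accuracy, precision, recall, F1).

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Synthetic, made-up data: two nominal features and a binary target.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "country": rng.choice(["BR", "DE", "JP", "US"], size=n),
    "occupation": rng.choice(["chef", "nurse", "pilot"], size=n),
})
# The target depends on which country it is, not on any ordering of the labels.
y = df["country"].isin(["BR", "JP"]).astype(int).to_numpy()

# Encoding 1: one-hot columns, as produced by pd.get_dummies().
X_onehot = pd.get_dummies(df, dtype=float)

# Encoding 2: integer codes, which impose an arbitrary order on the categories.
X_codes = df.apply(lambda col: col.astype("category").cat.codes)


def mean_scores(X, y):
    """Hypothetical helper: mean 5-fold CV scores for a logistic regression."""
    model = LogisticRegression(max_iter=1000)
    cv = cross_validate(
        model, X, y, cv=5,
        scoring=["accuracy", "precision", "recall", "f1"],
    )
    return {
        key[len("test_"):]: float(val.mean())
        for key, val in cv.items()
        if key.startswith("test_")
    }


scores_onehot = mean_scores(X_onehot, y)
scores_codes = mean_scores(X_codes, y)
print("one-hot      :", scores_onehot)
print("integer codes:", scores_codes)
```

A linear model is used here on purpose, because it is sensitive to the artificial ordering that integer codes introduce. Tree ensembles can often split their way around that ordering, so the gap may be smaller for them. On real data the outcome will depend on the dataset and the estimator.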

On Fri, Jul 21, 2017 at 11:52 AM, Sebastian Raschka <se.rasc...@gmail.com> wrote:
> Just to throw some additional ideas in here. Based on a conversation with a
> colleague some time ago, I think learning classifier systems
> (https://en.wikipedia.org/wiki/Learning_classifier_system) are particularly
> useful when working with large, sparse binary vectors (like from a one-hot
> encoding). I am really not into LCS's, and only know the basics (read through
> the first chapters of the Intro to Learning Classifier Systems draft; the
> print version will be out later this year).
> Also, I saw an interesting poster on a Set Covering Machine algorithm once,
> which they benchmarked against SVMs, random forests and the like for
> categorical (genomics data). Looked promising.
>
> Best,
> Sebastian
>
>> On Jul 21, 2017, at 2:37 PM, Raga Markely <raga.mark...@gmail.com> wrote:
>>
>> Thank you, Jacob. Appreciate it.
>>
>> Regarding 'perform better', I was referring to better accuracy, precision,
>> recall, F1 score, etc.
>>
>> Thanks,
>> Raga
>>
>> On Fri, Jul 21, 2017 at 2:27 PM, Jacob Schreiber <jmschreibe...@gmail.com>
>> wrote:
>> Traditionally tree based methods are very good when it comes to categorical
>> variables and can handle them appropriately. There is a current WIP PR to
>> add this support to sklearn. I'm not exactly sure what you mean that
>> "perform better" though. Estimators that ignore the categorical aspect of
>> these variables and treat them as discrete will likely perform worse than
>> those that treat them appropriately.
>>
>> On Fri, Jul 21, 2017 at 8:11 AM, Raga Markely <raga.mark...@gmail.com> wrote:
>> Hello,
>>
>> I am wondering if there are some classifiers that perform better for
>> datasets with categorical features (converted into sparse input matrix with
>> pd.get_dummies())? The data for the categorical features are nominal (order
>> doesn't matter, e.g. country, occupation, etc).
>>
>> If you could provide me some references (papers, books, website, etc), that
>> would be great.
>>
>> Thank you very much!
>> Raga

_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn