Given a categorical attribute, for instance city = ['a', 'b', 'c', 'd', 'e', 'f']:

With DictVectorizer(), we can transform "city" into a sparse matrix using a 1-of-K representation. But at each split, the decision tree then evaluates only a single indicator column, say city == 'a': True or False?

What I want is to ask whether the city is in a subset, e.g. city.isin(['a', 'b', 'c']): True or False?

As far as I know, the decision tree implementation in Spark's MLlib can do this. Can we do this within scikit-learn?

Best,
Rex
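P.S. To make the question concrete, here is a minimal pure-Python sketch of the two kinds of split I mean. It is not scikit-learn or MLlib code: the toy data and the helper names (gini, split_impurity, best_subset_split) are made up for illustration, and I am assuming Gini impurity as the split criterion. Restricting the search to subsets of size 1 mimics what a tree sees after 1-of-K encoding; lifting the restriction gives the subset test I am asking about.

```python
from itertools import combinations

# Toy data, invented for illustration: cities a, b, c are class 1; d, e, f are class 0.
cities = ['a', 'b', 'c', 'd', 'e', 'f', 'a', 'b', 'c', 'd', 'e', 'f']
labels = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]


def gini(ys):
    """Gini impurity of a list of 0/1 labels."""
    if not ys:
        return 0.0
    p = sum(ys) / len(ys)
    return 2.0 * p * (1.0 - p)


def split_impurity(values, ys, subset):
    """Weighted Gini impurity after splitting on `value in subset`."""
    left = [y for v, y in zip(values, ys) if v in subset]
    right = [y for v, y in zip(values, ys) if v not in subset]
    n = len(ys)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_subset_split(values, ys, max_size=None):
    """Exhaustively search category subsets up to `max_size` elements.

    max_size=1 corresponds to the single-category tests (city == 'a')
    available after 1-of-K encoding; the default tries every proper subset.
    """
    categories = sorted(set(values))
    if max_size is None:
        max_size = len(categories) - 1
    best = None
    for k in range(1, max_size + 1):
        for subset in combinations(categories, k):
            score = split_impurity(values, ys, set(subset))
            # Keep the first subset that reaches the lowest impurity.
            if best is None or score < best[0]:
                best = (score, set(subset))
    return best


# Best single-category test versus best subset test on the toy data.
one_hot_score, one_hot_subset = best_subset_split(cities, labels, max_size=1)
subset_score, subset_subset = best_subset_split(cities, labels)

print("best city == x split:  ", one_hot_subset, one_hot_score)
print("best city in S split:  ", subset_subset, subset_score)
```

On this toy data a single subset test separates the classes in one split, whereas no single city == x test can. The exhaustive search grows exponentially with the number of categories, so it is only meant to show the idea, not how a library would implement it.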
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general