Given a categorical attribute, for instance
city = ['a', 'b', 'c', 'd', 'e', 'f']

With DictVectorizer(), we can transform "city" into a sparse matrix using a
1-of-k (one-hot) representation.
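
To make the setup concrete, here is a minimal sketch of that encoding. The names `records`, `vec`, and `X` are only illustrative, and I am assuming each sample is passed as a one-key dict:

```python
from sklearn.feature_extraction import DictVectorizer

city = ['a', 'b', 'c', 'd', 'e', 'f']

# One dict per sample; string values are expanded into one indicator
# column per distinct value (1-of-k).
records = [{'city': c} for c in city]

vec = DictVectorizer()            # returns a scipy.sparse matrix by default
X = vec.fit_transform(records)    # one row per sample, one column per city

print(vec.feature_names_)         # column names of the form 'city=<value>'
print(X.toarray())                # dense view, only for inspection
```

Each row has exactly one non-zero entry, in the column of its city.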

But at each split, the decision tree evaluates only a single attribute, say
city == 'a' - True or False?
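
A small sketch of what I mean, assuming hypothetical labels `y` in which cities a, b, c belong to one class and d, e, f to the other. It fits a tree on the one-hot columns and lists the test used at each internal node:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeClassifier

city = ['a', 'b', 'c', 'd', 'e', 'f']
y = [1, 1, 1, 0, 0, 0]  # hypothetical labels: {a, b, c} vs {d, e, f}

vec = DictVectorizer(sparse=False)
X = vec.fit_transform([{'city': c} for c in city])

clf = DecisionTreeClassifier(random_state=0).fit(X, y)
tree = clf.tree_

# Internal nodes have a left child; leaves store -1 there.
internal = [n for n in range(tree.node_count) if tree.children_left[n] != -1]
for n in internal:
    name = vec.feature_names_[tree.feature[n]]
    print(f"node {n}: {name} <= {tree.threshold[n]:.1f}")

n_internal = len(internal)
print("internal nodes:", n_internal)
```

Every test involves a single indicator column, so on this toy data the tree needs several splits to separate two groups that one subset test would separate directly.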

What I want is to ask whether the city is in a subset:
city.isin(['a', 'b', 'c']) - True or False?
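
To illustrate the kind of split I have in mind, here is a pure-Python sketch, not taken from any library. The helper names `gini` and `best_subset_split` are hypothetical, the labels `y` are made up, and using Gini impurity as the criterion is my assumption:

```python
from itertools import combinations

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_subset_split(values, labels):
    """Search all subsets S of the categories for the split
    `value in S` vs `value not in S` with the lowest weighted Gini."""
    categories = sorted(set(values))
    n = len(values)
    best_subset, best_score = None, float('inf')
    # Sizes up to half the categories suffice, because S and its
    # complement describe the same split.
    for size in range(1, len(categories) // 2 + 1):
        for subset in combinations(categories, size):
            s = set(subset)
            left = [lab for v, lab in zip(values, labels) if v in s]
            right = [lab for v, lab in zip(values, labels) if v not in s]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best_subset, best_score = s, score
    return best_subset, best_score

city = ['a', 'b', 'c', 'd', 'e', 'f']
y = [1, 1, 1, 0, 0, 0]  # hypothetical labels
subset, score = best_subset_split(city, y)
print(sorted(subset), score)  # → ['a', 'b', 'c'] 0.0
```

The exhaustive search is only meant to show the idea: with k categories there are 2^(k-1) - 1 distinct subset splits, so it is practical only for small k.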


As far as I know, the decision tree implementation in Spark's MLlib can do this.

Can we do this within scikit-learn?


Best,
Rex