Turning them into strings first is by far the easiest solution!
Alternatively, look up the feature names in the
dict_vectorizer.feature_names_ attribute to find the column indices of
those features, then follow the DictVectorizer with a OneHotEncoder whose
categorical_features parameter is set to those indices.
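
For the string route, something along these lines should work. This is only
a rough sketch: the records, the "zipcode" key and the stringify helper are
made up for illustration and are not part of scikit-learn.

```python
from sklearn.feature_extraction import DictVectorizer

# Hypothetical records: "zipcode" holds numbers but is really a category.
records = [
    {"city": "Paris", "zipcode": 75001},
    {"city": "Lyon", "zipcode": 69001},
    {"city": "Paris", "zipcode": 75002},
]


def stringify(record, keys):
    """Return a copy of record with the values under `keys` cast to str."""
    return {k: (str(v) if k in keys else v) for k, v in record.items()}


# DictVectorizer one-hot encodes string values as "<key>=<value>" columns,
# so casting the numeric values to str makes them categorical.
vec = DictVectorizer(sparse=False)
X = vec.fit_transform([stringify(r, {"zipcode"}) for r in records])

print(vec.feature_names_)
print(X)
```

Each distinct zipcode then gets its own indicator column instead of a single
numeric column.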
HTH,
Joel
On 26 June 2014 17:54, Awhan Patnaik <awima...@gmail.com> wrote:
> Hello all
>
> Some of the features in my dataset have numerical values while others
> have strings and other non-numerical objects. DictVectorizer skips
> over those features which have all numerical values. How do I coerce
> it to treat them as categorical features?
>
> Thanks
>
>
> ------------------------------------------------------------------------------
> Open source business process management suite built on Java and Eclipse
> Turn processes into business applications with Bonita BPM Community Edition
> Quickly connect people, data, and systems into organized workflows
> Winner of BOSSIE, CODIE, OW2 and Gartner awards
> http://p.sf.net/sfu/Bonitasoft
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>