On Thu, Mar 15, 2012 at 2:51 PM, Olivier Grisel <[email protected]> wrote:
> On 15 March 2012 10:46, Conrad Lee <[email protected]> wrote:
> > I've got a matrix of feature vectors where all features are categorical.
> > Before I can treat these as features, they need to be binarized, as
> > described here.
> >
> > I have written my own function to solve this problem, but I'd rather use one
> > from scikit-learn that is both efficient and has been reviewed by others.
> > Is there any function that can do this for me? It seems like a common
> > problem--common enough that a function in scikit-learn should solve it for
> > me.
>
> There is a binarization transformer in the sklearn.preprocessing package:
>
> http://scikit-learn.org/stable/modules/preprocessing.html#binarization
>
>
That is not the same operation that Conrad is looking for. The link he
gave shows the following example: given a field called 'color' containing
categorical values (e.g. 'purple', 'blue', 'red'), create a new array in
which the 'color' field is replaced by three fields, say 'color#purple',
'color#blue', 'color#red', with boolean values.
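
To make the expected result concrete, here is a minimal sketch of that operation in plain Python. The helper name `binarize_categorical`, the sample records, and the 'field#value' column naming are only illustrative assumptions; this is not an existing scikit-learn function.

```python
def binarize_categorical(records, field):
    """Replace `field` in each record with one boolean column per distinct value.

    Hypothetical helper for illustration only; not part of scikit-learn.
    """
    # Collect the distinct categories in order of first appearance.
    categories = []
    for rec in records:
        if rec[field] not in categories:
            categories.append(rec[field])

    encoded = []
    for rec in records:
        # Copy every field except the categorical one.
        row = {k: v for k, v in rec.items() if k != field}
        # Add one boolean column per category, e.g. 'color#purple'.
        for cat in categories:
            row["%s#%s" % (field, cat)] = (rec[field] == cat)
        encoded.append(row)
    return encoded


# Made-up sample data: one categorical field and one numeric field.
records = [
    {"color": "purple", "size": 1.0},
    {"color": "blue", "size": 2.5},
    {"color": "red", "size": 0.5},
]
encoded = binarize_categorical(records, "color")
```

Each row of `encoded` keeps 'size' unchanged and has exactly one of 'color#purple', 'color#blue', 'color#red' set to True. That is different from thresholding numeric values, which is what the binarization transformer linked above does.
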
Warren

> There is also some related discussion in the dict vectorizer pull request:
>
> https://github.com/scikit-learn/scikit-learn/pull/686
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
>
> ------------------------------------------------------------------------------
> This SF email is sponsored by:
> Try Windows Azure free for 90 days Click Here
> http://p.sf.net/sfu/sfd2d-msazure
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>