Well, the columns after the OneHotEncoder correspond to feature values, not feature names, right? There is ``feature_indices_``, which maps each input feature to a range of columns in the encoded matrix. The features in the input matrix don't really have names in scikit-learn, as they are represented only as NumPy arrays. So you need to keep track of the indices of each feature yourself. That shouldn't be too hard, though.
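
To make the index bookkeeping concrete, here is a rough sketch in plain Python that does not call scikit-learn at all. It assumes the id columns hold non-negative integers, that the encoded columns come first (in the order of ``id_columns``) with one column per value actually seen, and that the remaining columns are appended unchanged on the right. The helper name ``encoded_column_names`` and the toy data are made up for illustration, and ``offsets`` is only an analogue of ``feature_indices_``, not the attribute itself.

```python
def encoded_column_names(rows, col_names, id_columns):
    """Build one name per output column of a one-hot encoding.

    Hypothetical helper: assumes integer-coded id columns, encoded columns
    first, and pass-through columns appended on the right.
    """
    names = []
    offsets = [0]  # start of each feature's column range in the output
    for j in id_columns:
        # Distinct values actually seen in column j, in sorted order.
        values = sorted({int(row[j]) for row in rows})
        names.extend("%s=%d" % (col_names[j], v) for v in values)
        offsets.append(len(names))
    # Columns that are not encoded keep their original names.
    names.extend(n for j, n in enumerate(col_names) if j not in id_columns)
    return names, offsets


# Toy data: columns 0 and 2 are ids, column 1 is a numeric feature.
rows = [[3, 0.5, 10], [7, 1.5, 10], [3, 2.5, 12]]
col_names = ["user_id", "price", "item_id"]
names, offsets = encoded_column_names(rows, col_names, id_columns=[0, 2])
print(names)
print(offsets)
```

Feature ``i`` of ``id_columns`` then owns the names in ``names[offsets[i]:offsets[i + 1]]``. Check the column order against what your encoder really produces before relying on it.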

Why don't you select the features before the encoding? Or do you want to exclude some values?


On 03/05/2015 05:55 AM, Eustache DIEMERT wrote:
Hi list,

I have an X (np.array) with some columns containing ids. I also have a list of column names. I want to transform the relevant columns with OneHotEncoder so that they can be used by a logistic regression model:

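>>> import numpy as np
>>> from sklearn.preprocessing import OneHotEncoder
>>> X = np.loadtxt(...)  # from a CSV
>>> col_names = ...  # from CSV header
>>> id_columns = ...  # indices of the columns containing ids
>>> e = OneHotEncoder(categorical_features=id_columns)
>>> Xprime = e.fit_transform(X)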

But then I don't know how to deduce the names of the columns in the new matrix :(

Ideally I would want the same as DictVectorizer which has a feature_names_ member.

Has anyone already had this problem?

Eustache



_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general