Re: [Scikit-learn-general] feature names after OneHotEncoder

2015-03-06 Thread Andreas Mueller
I thought you just wanted to mask some features, but I guess that was not what you intended. You could make your code robust to future changes by using the feature_indices_ attribute, while assuming that the result contains all the categorical columns first, followed by all the numerical ones. Btw, you might have an easier
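The offset-based approach suggested above could look roughly like the following sketch. It uses no scikit-learn calls: `offsets` is a hand-written stand-in for what `feature_indices_` is assumed to hold (cumulative column offsets, one more entry than there are categorical features), and `names_from_offsets` is a hypothetical helper. It also assumes that no output columns are masked out afterwards.

```python
def names_from_offsets(offsets, categorical_names, numerical_names):
    """Label output columns from cumulative offsets (hypothetical helper).

    Assumption: categorical feature i occupies output columns
    offsets[i] .. offsets[i + 1] - 1, and the numerical columns follow
    all categorical blocks.
    """
    names = []
    for i, name in enumerate(categorical_names):
        width = offsets[i + 1] - offsets[i]
        # One label per column in this feature's block.
        names.extend("%s[%d]" % (name, k) for k in range(width))
    return names + list(numerical_names)


# Illustrative values only: two categorical features with 2 and 3 columns.
offsets = [0, 2, 5]
names = names_from_offsets(offsets, ["user_id", "item_id"], ["price"])
print(names)
```

Deriving block widths from the offsets rather than recounting distinct values keeps the labels tied to whatever the encoder actually produced.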

Re: [Scikit-learn-general] feature names after OneHotEncoder

2015-03-06 Thread Eustache DIEMERT
Well, after a bit of tinkering, it seems that OneHotEncoder follows simple rules to assign columns to the output: 1) first the categorical features, in the order given by the argument, creating as many columns as their values require; 2) then the numerical features. So a piece of code like this seems to work: fn = [] fc =
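The code in the message above is cut off, so what follows is not the author's snippet. It is a minimal pure-Python sketch of the two rules as stated, taking them at face value. The helper name `one_hot_feature_names`, the sample rows, and the `name=value` label format are all illustrative assumptions, as is the choice to sort the distinct values within each categorical column.

```python
def one_hot_feature_names(rows, col_names, categorical):
    """Build output column names under the ordering described above.

    Assumptions: each categorical column yields one output column per
    distinct value (sorted ascending), categorical columns come first in
    the order given by `categorical`, and the numerical columns follow
    in their original order.
    """
    names = []
    for idx in categorical:
        values = sorted({row[idx] for row in rows})
        names.extend("%s=%s" % (col_names[idx], v) for v in values)
    categorical_set = set(categorical)
    # Append the untouched numerical columns after all categorical blocks.
    names.extend(n for i, n in enumerate(col_names) if i not in categorical_set)
    return names


# Illustrative data: columns 0 and 2 hold ids, column 1 is numerical.
rows = [
    [3, 0.5, 10],
    [7, 1.5, 20],
    [3, 2.5, 10],
]
col_names = ["user_id", "price", "item_id"]
names = one_hot_feature_names(rows, col_names, categorical=[0, 2])
print(names)
```

It is worth comparing the resulting list length with the number of columns in the transformed matrix before relying on the names, since the ordering is an observed behaviour rather than a documented guarantee.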

Re: [Scikit-learn-general] feature names after OneHotEncoder

2015-03-06 Thread Eustache DIEMERT
2015-03-05 16:57 GMT+01:00 Andy t3k...@gmail.com: "Well, the columns after the OneHotEncoder correspond to feature values, not feature names, right?" Well, for the categorical ones this is right, except that not all my features are categorical (hence the categorical_features=...) and they are

[Scikit-learn-general] feature names after OneHotEncoder

2015-03-05 Thread Eustache DIEMERT
Hi list, I have an X (np.array) in which some columns contain ids. I also have a list of column names. I want to transform the relevant columns with OneHotEncoder so that they can be used by a logistic regression model: X = np.loadtxt(...) # from a CSV col_names = ... # from CSV header e =