On Mon, Jan 7, 2019 at 11:50 PM pisymbol <pisym...@gmail.com> wrote:

> According to the doc (0.20.2), the coef_ attribute is supposed to have shape
> (1, n_features) for binary classification. Well, I created a Pipeline and
> ran a GridSearchCV to build a LogisticRegression model that does fairly
> well. However, when I went to rank feature importance, I noticed that
> coef_ on my best_estimator_ has 24 entries while my training data has 22
> features.
>
> What am I missing? How could coef_ > n_features?

Just a follow-up: I am using a OneHotEncoder to encode two categoricals as part of my pipeline (I am also using an imputer and a standard scaler, but I don't see how those could add features).
Could my pipeline actually add two more features during fitting?

-aps
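To illustrate what I suspect is happening, here is a minimal sketch with made-up data (these are not my actual columns): OneHotEncoder replaces each categorical column with one column per category level, so the encoded matrix can have more columns than the raw input.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: two categorical columns, one with 2 levels
# ("red"/"blue") and one with 3 levels ("S"/"M"/"L").
cats = np.array([
    ["red", "S"],
    ["blue", "M"],
    ["red", "L"],
    ["blue", "S"],
])

enc = OneHotEncoder()
# fit_transform returns a sparse matrix by default; densify to inspect it.
encoded = enc.fit_transform(cats).toarray()

# Two input columns expand to 2 + 3 = 5 one-hot columns.
print(encoded.shape)  # -> (4, 5)
```

So if each of my two categoricals had 2 levels, those 2 raw columns would become 4 encoded columns, turning 22 input features into 24 coefficients, which would match what I'm seeing.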
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn