Re: [Scikit-learn-general] Scaling a Subset of Features in SKLEARN

2015-03-02 Thread Sebastian Raschka

Hi Jason, like Andreas said, you really have to be careful with categorical features. I think the one-hot encoder is more for nominal features, though; I would handle ordinal ones differently. E.g., if you have "sizes" like "M", "L", "S", "XL", I would encode them as ["M", "L", "S", "XL"] -> [
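
The message preview is cut off before the target values, so the sketch below only illustrates the general idea of encoding an ordinal feature by its rank. The ordering S < M < L < XL and the integers 0 to 3 are assumptions made for this example, not the mapping from the original message, and `size_order` is a hypothetical name.

```python
# Assumed ordering of the ordinal categories (hypothetical, chosen for illustration).
size_order = {"S": 0, "M": 1, "L": 2, "XL": 3}

sizes = ["M", "L", "S", "XL"]

# Replace each label by its rank so that the order S < M < L < XL is preserved.
encoded = [size_order[s] for s in sizes]
print(encoded)  # [1, 2, 0, 3]
```

Unlike one-hot encoding, this keeps a single column and preserves the order, at the cost of implying equal spacing between adjacent sizes, which may or may not suit a given model.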

Re: [Scikit-learn-general] Scaling a Subset of Features in SKLEARN

2015-03-02 Thread Andy

Hi Jason. We don't have any support for groups or types of features currently, sorry. And you do need to convert all categorical features to one-hot encoded features for use with sklearn. The underlying issue is that we use numpy arrays as our main data structure, and they are not very easy to
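
Since the reply says there is no built-in notion of feature groups, one workaround is to slice the array by hand. The following is a minimal sketch in plain NumPy, assuming a small made-up array in which columns 0 and 1 are continuous and column 2 holds a nominal category already stored as integer codes; the data and the names `numeric_cols`, `categorical_col` and `X_prepared` are hypothetical.

```python
import numpy as np

# Hypothetical data: columns 0 and 1 are continuous, column 2 is a nominal
# category stored as integer codes 0..2.
X = np.array([
    [1.0, 200.0, 0],
    [2.0, 400.0, 2],
    [3.0, 600.0, 1],
    [4.0, 800.0, 2],
])

numeric_cols = [0, 1]
categorical_col = 2

# Standardize only the continuous columns (zero mean, unit variance).
numeric = X[:, numeric_cols]
scaled = (numeric - numeric.mean(axis=0)) / numeric.std(axis=0)

# One-hot encode the nominal column by indexing into an identity matrix.
codes = X[:, categorical_col].astype(int)
n_categories = codes.max() + 1
one_hot = np.eye(n_categories)[codes]

# Reassemble everything into one purely numeric array.
X_prepared = np.hstack([scaled, one_hot])
print(X_prepared.shape)  # (4, 5)
```

The manual standardization here uses the population standard deviation, which is also what `sklearn.preprocessing.StandardScaler` computes, so fitting that scaler on `X[:, numeric_cols]` should be an equivalent alternative. In practice the mean and standard deviation would be estimated on the training data only and reused for new data.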