Hi,

How can I properly handle categorical values in scikit-learn? For context, see:
https://stackoverflow.com/questions/45727934/pandas-categories-new-levels?noredirect=1#comment78424496_45727934


Goals:

   - scikit-learn style fit/transform methods to encode the labels of the
   categorical features of X
   - should handle unseen labels
   - should be faster than what I currently do, i.e. running a label encoder
   manually for each fold and manually checking whether each label was
   already seen in the training data (see
   https://stackoverflow.com/questions/45727934/pandas-categories-new-levels?noredirect=1#comment78424496_45727934
   which links to https://gist.github.com/geoHeil/5caff5236b4850d673b2c9b0799dc2ce
   )
   - only some columns are categorical, and only these should be converted
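
To make the goals above concrete, here is a rough sketch of the behaviour I am
after. It is only an illustration under some assumptions: it relies on a
scikit-learn release of 0.24 or later, where OrdinalEncoder accepts
handle_unknown="use_encoded_value", combined with ColumnTransformer. The column
names and data are made up, and mapping unseen labels to -1 is just one
possible convention.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy data: two categorical columns and one numeric column.
train = pd.DataFrame({
    "city": ["Vienna", "Graz", "Vienna"],
    "color": ["red", "blue", "red"],
    "amount": [1.0, 2.5, 3.0],
})
test = pd.DataFrame({
    "city": ["Graz", "Linz"],      # "Linz" never appears in train
    "color": ["blue", "green"],    # "green" never appears in train
    "amount": [4.0, 5.5],
})

# Only these columns are encoded; everything else is passed through unchanged.
categorical_columns = ["city", "color"]

encoder = ColumnTransformer(
    transformers=[
        (
            "cat",
            # Assumption: unseen labels are mapped to -1 instead of raising.
            OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1),
            categorical_columns,
        ),
    ],
    remainder="passthrough",
)

# Fit on the training fold only, then reuse the fitted mapping on the test fold.
encoder.fit(train)
encoded_test = encoder.transform(test)

# Output columns are ordered as: encoded categorical columns, then the rest.
print(encoded_test)
```

Because the object follows the usual fit/transform contract, it could in
principle be placed inside a Pipeline so that each cross-validation fold is
encoded from its own training data, without a manual per-fold loop.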


Regards,
Georg
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn