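You should also store your one-hot transformer (the fitted OneHotEncoder) alongside the model, and reuse it to transform the new data instead of fitting a new one.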
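A minimal sketch of what that could look like, assuming a recent scikit-learn and integer-coded categorical columns. The toy arrays and variable names are hypothetical, and handle_unknown="ignore" is only one possible choice for categories that never appeared in training:

```python
import pickle

import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: two integer-coded categorical columns.
X_train = np.array([[0, 1], [1, 2], [2, 0]])
X_new = np.array([[1, 1], [0, 5]])  # category 5 was never seen in training

# Fit the encoder on the training data only.
# handle_unknown="ignore" (an assumption here) encodes unseen categories
# as all zeros instead of raising an error.
encoder = OneHotEncoder(handle_unknown="ignore")
X_train_enc = encoder.fit_transform(X_train)

# Persist the fitted encoder together with the model.
# Serialized to bytes here; writing it to a file works the same way.
blob = pickle.dumps(encoder)

# At prediction time: load the stored encoder and call transform only.
restored = pickle.loads(blob)
X_new_enc = restored.transform(X_new)

# Both matrices now have the same number of columns.
n_train_features = X_train_enc.shape[1]
n_new_features = X_new_enc.shape[1]
```

Because the stored encoder is only asked to transform, the column layout of the new data matches what the classifier was trained on.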

On 11/17/2015 01:19 AM, Startup Hire wrote:
Hi Pypers,

Hope you are doing well.

I am doing multi-label classification in which my X and Y are sparse matrices, with Y properly binarized.

I was able to train a multi-label classifier with 12338 features. I saved the model and tried to use it for prediction on new data.

This is the issue I am facing:

  * The number of features in the model is quite different from the
    number in the new data. This is because one-hot encoding the
    categorical variables produces a different number of features on
    the training data than on the new data.

Please let me know how this can be resolved. Should I make any upstream changes?
Regards,
Sanant



------------------------------------------------------------------------------


_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
