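You should also store your fitted OneHotEncoder and use it to transform the new data, so that training and new data are encoded into the same columns.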
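A minimal sketch of what I mean, assuming a recent scikit-learn release in which OneHotEncoder accepts string categories. The toy arrays and variable names are made up for illustration, and pickle stands in for whatever persistence you already use for the model:

```python
import pickle

import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: two categorical columns.
X_train = np.array(
    [["red", "small"], ["blue", "large"], ["green", "small"]], dtype=object
)
# New data contains a category ("purple") that was never seen in training.
X_new = np.array([["red", "large"], ["purple", "small"]], dtype=object)

# Fit the encoder on the training data only.
# handle_unknown="ignore" encodes unseen categories as all zeros
# instead of raising an error at transform time.
encoder = OneHotEncoder(handle_unknown="ignore")
X_train_enc = encoder.fit_transform(X_train)

# Persist the fitted encoder alongside the model (in-memory bytes here;
# writing the bytes to a file works the same way).
blob = pickle.dumps(encoder)

# Later, at prediction time: restore the encoder and only call transform().
restored = pickle.loads(blob)
X_new_enc = restored.transform(X_new)

# Both matrices now have the same number of columns.
print(X_train_enc.shape, X_new_enc.shape)
```

The important part is that the encoder is never refit on the new data: calling fit (or fit_transform) again is what produces a different number of columns.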
On 11/17/2015 01:19 AM, Startup Hire wrote:
Hi Pypers,
Hope you are doing well.
I am doing multi label classification in which my X and Y are sparse
matrices with Y properly binarized.
I was able to train a multi label classifier with 12338
features. I saved the model and tried to use it for prediction on
new data.
This is the issue I am facing:
* The number of features in the model is quite
different from that of the new data. This is because OneHotEncoding of
categorical variables leads to a different # of features on
training data vs new data.
Please let me know how this can be resolved. Should I make
any upstream changes?
Regards,
Sanant
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general