Hello,

I was wondering whether there is a way to improve the performance of the one-hot encoder, or whether there are any plans to do so in the future. I am working with a matrix that will ultimately have 20 million categorical variables, and my bottleneck is the one-hot encoder.
Let me know if this isn't the place to inquire. My code for the encoder is very simple, but I have pasted it here for completeness:

    from sklearn.preprocessing import OneHotEncoder

    enc = OneHotEncoder(sparse=True)
    Xtrain = enc.fit_transform(tiledata)

Thanks,
Sarah
_______________________________________________
scikit-learn mailing list
email@example.com
https://mail.python.org/mailman/listinfo/scikit-learn