Re: [scikit-learn] Text classification of large dataset

Joel Nothman Wed, 20 Dec 2017 12:14:58 -0800

To clarify:
You have 2.3M samples.
How many features are there?
How many features are active (non-zero) per sample, on average?
Are the 7k classes multiclass or multilabel?
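
In case it helps to answer these, here is a rough sketch of how the numbers could be read off a sparse feature matrix and a label array. The tiny X and y below are made-up stand-ins, not your data; type_of_target is scikit-learn's helper for telling multiclass from multilabel targets.

```python
import numpy as np
from scipy import sparse
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-in for the real feature matrix (3 samples, 4 features).
X = sparse.csr_matrix(np.array([[1, 0, 0, 2],
                                [0, 0, 3, 0],
                                [4, 5, 0, 6]]))
# Hypothetical stand-in for the real labels (one label per sample).
y = ["class_a", "class_b", "class_c"]

n_samples, n_features = X.shape
# nnz counts the stored non-zero entries, so this is the mean number
# of active features per sample.
avg_active = X.nnz / n_samples
# 'multiclass' for one label per sample; a binary indicator matrix
# would be reported as 'multilabel-indicator'.
target_type = type_of_target(y)

print(n_samples, n_features, avg_active, target_type)
```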


Have you tried limiting the depth of the trees in the forest? Have you tried
embedding your feature space into a smaller vector (pre-trained embeddings,
hashing, LDA, PCA or random projection)?
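
To make those two suggestions concrete, here is a minimal sketch, assuming raw text documents as input. It hashes the text into a fixed-size sparse space, projects that down with TruncatedSVD (a PCA-like reduction that works on sparse input), and fits a forest with a depth cap. The toy corpus, labels and all parameter values are placeholders I picked for illustration, not recommendations for your data.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.pipeline import make_pipeline

# Toy stand-in corpus; the real input would be the 2.3M documents.
docs = [
    "cheap flights to paris",
    "hotel deals in paris",
    "python pandas dataframe merge",
    "numpy array broadcasting error",
    "flights and hotel bundle",
    "scikit-learn random forest memory",
]
labels = ["travel", "travel", "code", "code", "travel", "code"]

model = make_pipeline(
    # Hashing fixes the feature count up front, with no vocabulary to store.
    HashingVectorizer(n_features=2**12, alternate_sign=False),
    # Embed the sparse hashed features into a small dense space.
    TruncatedSVD(n_components=4, random_state=0),
    # max_depth bounds the size of each tree, and so the model's memory use.
    RandomForestClassifier(n_estimators=20, max_depth=5, random_state=0),
)
model.fit(docs, labels)
predictions = model.predict(docs)
print(predictions)
```

On the real data you would want far more SVD components than 4, and the depth cap is something to tune against held-out accuracy.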

