Hi all. 

I’m working on text classification of Wikipedia documents. I am using a 
word-count approach to extract features from my text, so I end up with a large 
vocabulary containing every word in the training documents, after lemmatization 
and stop-word removal. I now have 70000 features. I think that for this kind 
of (word-based) problem it is not a good idea to do feature selection (with 
SVD or PCA). The current accuracy is 77%. 

Do you think I need to do feature selection to improve the accuracy? 
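
For reference, here is a minimal sketch of how the question could be checked empirically: compare cross-validated accuracy with all word-count features against accuracy after keeping only the top-k features. Note the assumptions: the classifier is not stated above, so MultinomialNB is only a stand-in; chi-squared selection (SelectKBest with chi2) is used here instead of SVD or PCA; the function name and the values of k and cv are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def compare_feature_selection(texts, labels, k=10000, cv=5):
    """Return mean cross-validated accuracy without and with chi2 selection.

    texts  : list of raw (already lemmatized) document strings
    labels : list of class labels, one per document
    k      : number of features to keep (hypothetical value, tune it;
             it should not exceed the vocabulary size)
    cv     : number of cross-validation folds
    """
    # Baseline: all word-count features go straight to the classifier.
    # MultinomialNB is an assumption, swap in the classifier actually used.
    baseline = make_pipeline(
        CountVectorizer(stop_words="english"),
        MultinomialNB(),
    )

    # Same pipeline, but only the k features with the highest chi2 score
    # are kept. Selection sits inside the pipeline so that it is refit on
    # each training fold and never sees the held-out documents.
    selected = make_pipeline(
        CountVectorizer(stop_words="english"),
        SelectKBest(chi2, k=k),
        MultinomialNB(),
    )

    return {
        "all_features": cross_val_score(baseline, texts, labels, cv=cv).mean(),
        "chi2_top_k": cross_val_score(selected, texts, labels, cv=cv).mean(),
    }
```

Calling compare_feature_selection(train_texts, train_labels, k=10000) for a few values of k would show whether selection helps on this dataset; train_texts and train_labels are placeholder names for the training documents and their classes.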

Thank you for your answers. Regards. 

Luigi 


