Hi all,
I am doing text classification. I have around 10 million records to be
classified into around 7k categories.
Below is the code I am using:
# Importing the libraries
import pandas as pd
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from
Hi all,
Thank you for your suggestions.
But I am still getting a memory error while doing feature selection:
fs = feature_selection.SelectPercentile(feature_selection.chi2, percentile=20)
documenttermmatrix1 = fs.fit_transform(documenttermmatrix, y1)
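For reference, here is a minimal self-contained sketch of this step on a toy corpus. The documents and labels are made up for illustration, and the use of CountVectorizer to build the matrix is an assumption. The data is far too small to reproduce the memory error; it only shows the shape of the call on a sparse input.

```python
from scipy import sparse
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectPercentile, chi2

# Toy documents and integer class labels, invented for illustration only.
docs = [
    "switch for air conditioner",
    "control transformer",
    "cooling pad for laptop",
    "drilling machine",
    "hair smoothing cream",
    "air conditioner remote control",
]
y1 = [0, 0, 1, 2, 3, 0]

# CountVectorizer returns a scipy sparse matrix (documents x terms).
documenttermmatrix = CountVectorizer().fit_transform(docs)

# Keep the top 20 percent of terms ranked by the chi-squared statistic.
fs = SelectPercentile(chi2, percentile=20)
documenttermmatrix1 = fs.fit_transform(documenttermmatrix, y1)

# The output stays sparse; only the number of columns shrinks.
print(documenttermmatrix.shape, "->", documenttermmatrix1.shape)
```

The input is passed as a sparse matrix and the selected output is sparse as well, so nothing in this sketch converts the document-term matrix to a dense array.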
*documenttermmatrix* will be of shape
Hi all,
I have a very large pandas dataframe. Below is a sample:
Id  description
1   switvch for air conditioner transformer..
2   control tfrmr...
3   coling pad.
4   DRLG machine
5   hair smothing