Hi all,

I have a very large pandas DataFrame. Below is a sample:

    Id       description
    1        switvch for air conditioner transformer..............
    2        control tfrmr...........
    3        coling pad.................
    4        DRLG machine
    5        hair smothing kit...............

For further processing, I will construct a document-term matrix from the
above data using scikit-learn's CountVectorizer:

from sklearn.feature_extraction.text import CountVectorizer

countvec = CountVectorizer()
documenttermmatrix = countvec.fit_transform(dataset['description'])

I have to correct the misspelled words in the descriptions. Replacing each
wrongly spelled word with its correct spelling across such a large dataset
takes too much time.

So I thought of correcting the features instead, using the feature list
that CountVectorizer provides:

features_names = countvec.get_feature_names()

Is it possible to rename features using the above list and then use the
result for classification?

Thanks
Ranjana
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn