Re: [Scikit-learn-general] Get TF-IDF mapped with associated word vector

JAGANADH G Mon, 14 May 2012 10:48:11 -0700

On Fri, May 11, 2012 at 3:06 PM, Olivier Grisel <[email protected]> wrote:


> 2012/5/10 JAGANADH G <[email protected]>:
> > Hi all
> >
> > Is there any way to get the TF-IDF value mapped with the word vector in
> > sklearn.
> >
> > I would like to get output like
> >
> > w1 -> TF-IDF
> > w2 -> TF-IDF
>
> TF is sample-dependent but the IDF weights for each feature index are
> stored as an array attribute named `idf_` on the fitted vectorizer
> along with the `vocabulary_` that maps each word to its feature index,
> and hence to its IDF weight.
>
> See the documentation for more details:
>
>
> http://scikit-learn.org/dev/modules/feature_extraction.html#text-feature-extraction
>
>
Thanks, Olivier.
I tried that and am pasting the code below. Am I following the correct
procedure?

[code]
from sklearn.datasets import load_files
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer

categories = ["pos", "neg"]
mov_train = load_files("/usr/share/nltk_data/corpora/movie_reviews",
                       categories=categories, shuffle=True, random_state=42)

cvect = CountVectorizer()
train_counts = cvect.fit_transform(mov_train.data)
tfidf_tr = TfidfTransformer(use_idf=True).fit(train_counts)

# vocabulary_ maps each word to its column index, so look up idf_ by that
# index; zipping the dict with the array would pair words with the wrong weights.
for word, idx in sorted(cvect.vocabulary_.items()):
    print('%r => %r' % (word, tfidf_tr.idf_[idx]))

[\code]
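
Since TF is sample-dependent, the full TF-IDF weight of a word only exists per document. Below is a minimal sketch of how I think the "w1 -> TF-IDF" output could be produced for each document. The three-sentence corpus and the names `docs`, `index_to_word` and `per_doc` are my own placeholders, not part of the movie review data, and I am assuming the default settings of CountVectorizer and TfidfTransformer.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer

# Hypothetical toy corpus standing in for the movie reviews.
docs = ["good movie", "bad movie", "good good plot"]

cvect = CountVectorizer()
counts = cvect.fit_transform(docs)
tfidf = TfidfTransformer(use_idf=True).fit_transform(counts)

# Invert vocabulary_ (word -> column index) so columns can be named.
index_to_word = {idx: word for word, idx in cvect.vocabulary_.items()}

# Walk the non-zero entries of the sparse matrix and collect,
# for each document, a dict of word -> TF-IDF weight.
coo = tfidf.tocoo()
per_doc = [dict() for _ in docs]
for i, j, v in zip(coo.row, coo.col, coo.data):
    per_doc[int(i)][index_to_word[int(j)]] = float(v)

for doc_id, weights in enumerate(per_doc):
    print('document %d' % doc_id)
    for word, weight in sorted(weights.items()):
        print('  %s -> %.3f' % (word, weight))
```

With the default settings each row is L2-normalized, so the weights of one document are only comparable within that document.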
-- 
**********************************
JAGANADH G
http://jaganadhg.in
*ILUGCBE*
http://ilugcbe.org.in
