On Wed, May 23, 2012 at 10:08 PM, Olivier Grisel <[email protected]> wrote:
> 2012/5/14 JAGANADH G <[email protected]>:
> >
> >
> > On Fri, May 11, 2012 at 3:06 PM, Olivier Grisel <[email protected]> wrote:
> >>
> >> 2012/5/10 JAGANADH G <[email protected]>:
> >> > Hi all
> >> >
> >> > Is there any way to get the TF-IDF value mapped to each word of the
> >> > word vector in sklearn?
> >> >
> >> > I would like to get output like
> >> >
> >> > w1 -> TF-IDF
> >> > w2 -> TF-IDF
> >>
> >> TF is sample-dependent, but the IDF weights for each feature index are
> >> stored as an array attribute named `idf_` on the fitted vectorizer,
> >> along with the `vocabulary_` that maps each word to its feature index
> >> and hence to its IDF weight.
> >>
> >> See the documentation for more details:
> >>
> >> http://scikit-learn.org/dev/modules/feature_extraction.html#text-feature-extraction
> >>
> >
> > Thanks Olivier,
> > I tried the same. I am pasting the code below. Am I following the correct
> > procedure?
> >
> > [code]
> > from sklearn.datasets import load_files
> > categories = ["pos","neg"]
> > mov_train = load_files("/usr/share/nltk_data/corpora/movie_reviews",
> >                        categories=categories, shuffle=True, random_state=42)
> > from sklearn.feature_extraction.text import CountVectorizer
> > from sklearn.feature_extraction.text import TfidfTransformer
> >
> > cvect = CountVectorizer()
> > train_counts = cvect.fit_transform(mov_train.data)
> > tfidf_tr = TfidfTransformer(use_idf=True).fit(train_counts)
> >
> > for word,fr in zip(cvect.vocabulary_,tfidf_tr.idf_):
> >     print '%r => %r' % (word, fr)
>
> `vocabulary_` is a dict mapping each word to its feature index, so its
> iteration order does not match the order of `idf_`. Instead, look each
> weight up by its index:
>
> for word, feature_idx in cvect.vocabulary_.iteritems():
>     print '%r => %r' % (word, tfidf_tr.idf_[feature_idx])
>
> (untested).
>
>
Hi Olivier,
Thanks for the solution.
It works !!!
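
For the archive, here is a minimal sketch of why the lookup by index matters. It does not use sklearn: the `vocabulary` dict and `idf` list are hypothetical stand-ins for `cvect.vocabulary_` and `tfidf_tr.idf_`, the three toy documents are made up, and the IDF formula is the plain textbook log(N / df), which is an assumption and may differ from what TfidfTransformer actually computes. It is written for Python 3, unlike the Python 2 snippets quoted above.

```python
import math

# Toy corpus (made up for illustration), tokenized on whitespace.
docs = ["a good movie", "a bad movie", "a good good plot"]
tokenized = [doc.split() for doc in docs]

# Stand-in for vocabulary_: word -> column index, indices in alphabetical order.
# The dict is filled in reverse so that its iteration order visibly differs
# from the index order (Python 3.7+ dicts iterate in insertion order).
words = sorted({w for doc in tokenized for w in doc})
vocabulary = {}
for index in reversed(range(len(words))):
    vocabulary[words[index]] = index

# Stand-in for idf_: one weight per column index.
# Textbook IDF, log(N / df); an assumption, not necessarily sklearn's formula.
n_docs = len(tokenized)
idf = [0.0] * len(words)
for word, index in vocabulary.items():
    df = sum(1 for doc in tokenized if word in doc)
    idf[index] = math.log(n_docs / df)

# Wrong: zip pairs the i-th word in dict iteration order with idf[i].
misaligned = dict(zip(vocabulary, idf))

# Right: use the stored index to pick the matching weight.
idf_by_word = {word: idf[index] for word, index in vocabulary.items()}

for word in sorted(idf_by_word):
    print("%r => %.3f (zip would give %.3f)"
          % (word, idf_by_word[word], misaligned[word]))
```

With real sklearn objects, the same dict comprehension should apply with `cvect.vocabulary_` and `tfidf_tr.idf_` in place of the stand-ins.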
Best regards
--
**********************************
JAGANADH G
http://jaganadhg.in
*ILUGCBE*
http://ilugcbe.org.in
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general