A bit more concretely, have a look at this class:
http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfTransformer.html

It is a transformer, so you can apply it to any numeric matrix (that doesn't
mean the result makes sense, just that you can). It is designed for a matrix
of term counts:

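from sklearn.feature_extraction.text import TfidfTransformer

# Create the original matrix (placeholder, e.g. a document-term count matrix)
X = create_original_matrix_somehow()
# Set up a TfidfTransformer and compute the tf-idf weights
tfidf = TfidfTransformer()
X_tfidf = tfidf.fit_transform(X)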


X_tfidf then holds the tf-idf weights, one row per row of X.
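
To make that a bit more concrete, here is a minimal sketch. It assumes the
original matrix is a bag-of-words count matrix built with CountVectorizer,
which is only one possible way to create it, and the sample texts are made
up for illustration. It also checks that TfidfVectorizer, which combines both
steps, gives the same values:

```python
import numpy as np
from sklearn.feature_extraction.text import (
    CountVectorizer,
    TfidfTransformer,
    TfidfVectorizer,
)

# Hypothetical short texts, invented for this example.
docs = [
    "cheap flights to paris",
    "paris hotel deals",
    "python sparse matrix tutorial",
    "scipy sparse matrix format",
]

# Assumption: the "original matrix" holds raw term counts per document.
X = CountVectorizer().fit_transform(docs)

# Reweight the counts with tf-idf (default settings, L2-normalised rows).
tfidf = TfidfTransformer()
X_tfidf = tfidf.fit_transform(X)

# TfidfVectorizer performs counting and tf-idf weighting in one step.
X_direct = TfidfVectorizer().fit_transform(docs)
same = np.allclose(X_tfidf.toarray(), X_direct.toarray())
print(X_tfidf.shape, same)
```

So for raw text you can use TfidfVectorizer directly, and keep
TfidfTransformer for the case where you already have a count matrix.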



On 1 July 2014 08:05, Joel Nothman <joel.noth...@gmail.com> wrote:

> It may be beneficial to use some kind of query expansion or unsupervised
> dimensionality reduction, as the vectors from a bag of words encoding will
> probably be very sparse. Does that help?
>
>
> On 30 June 2014 03:03, Abijith Kp <abijith....@gmail.com> wrote:
>
>> Hi,
>>
>> Is it possible to use TfidfVectorizer to cluster very short texts?
>> By short I mean fewer than 20 words.
>>
>> Or is there a better way to do it?
>>
>> Regards,
>> Abijith
>>
>> --
>> Abijith KP
>> github.com/abijith-kp
>> kpabijith.wordpress.com
>>
>>
>> ------------------------------------------------------------------------------
>> Open source business process management suite built on Java and Eclipse
>> Turn processes into business applications with Bonita BPM Community
>> Edition
>> Quickly connect people, data, and systems into organized workflows
>> Winner of BOSSIE, CODIE, OW2 and Gartner awards
>> http://p.sf.net/sfu/Bonitasoft
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>>
>
>