Yes, it is possible, but the vectors from a bag-of-words encoding of such
short texts will probably be very sparse, so few documents will share terms.
It may help to apply some kind of query expansion or unsupervised
dimensionality reduction before clustering. Does that help?
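
For the dimensionality reduction route, here is a minimal sketch, assuming
TruncatedSVD (i.e. LSA) on top of the tf-idf matrix followed by KMeans. The
example texts, n_components=3 and n_clusters=2 are made up for illustration
and would need tuning on your data; this is one possible setup, not the only
way to do it.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer

# Hypothetical short texts (well under 20 words each), for illustration only.
docs = [
    "cheap flights to paris",
    "paris hotel deals",
    "book flights and hotel",
    "python list comprehension syntax",
    "python dict syntax error",
    "sort a list in python",
]

# Sparse bag-of-words encoding with tf-idf weighting.
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)

# Project onto a few dense components (LSA), then L2-normalize the rows so
# that Euclidean k-means behaves roughly like clustering by cosine similarity.
# n_components=3 is an arbitrary choice for this toy data.
lsa = make_pipeline(
    TruncatedSVD(n_components=3, random_state=0),
    Normalizer(copy=False),
)
X_reduced = lsa.fit_transform(X)

# Cluster in the reduced space; n_clusters=2 is an assumption for the example.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = km.fit_predict(X_reduced)

for label, doc in zip(labels, docs):
    print(label, doc)
```

With real data you would typically keep more components (tens to a few
hundred) and pick the number of clusters by inspecting the results.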
On 30 June 2014 03:03, Abijith Kp <abijith....@gmail.com> wrote:
> Hi,
>
> Is it possible to use TfidfVectorizer to cluster very short texts?
> By short I mean fewer than 20 words.
>
> Or is there a better way to do it?
>
> Regards,
> Abijith
>
> --
> Abijith KP
> github.com/abijith-kp
> kpabijith.wordpress.com
>
>
> ------------------------------------------------------------------------------
> Open source business process management suite built on Java and Eclipse
> Turn processes into business applications with Bonita BPM Community Edition
> Quickly connect people, data, and systems into organized workflows
> Winner of BOSSIE, CODIE, OW2 and Gartner awards
> http://p.sf.net/sfu/Bonitasoft
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>