Pulling the IDF out of Lucene is a little trickier, but otherwise a
DictVectorizer pipelined with a TfidfTransformer should be able to do this.
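
A rough sketch of what that pipeline could look like, assuming the Lucene
term vectors have already been exported as one {term: frequency} dict per
document. The term_vectors data below is made up for illustration, and the
IDF here is re-estimated by scikit-learn from these documents rather than
pulled from the Lucene index:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.pipeline import make_pipeline

# Hypothetical per-document term frequencies, standing in for term
# vectors exported from Lucene (one dict per document).
term_vectors = [
    {"lucene": 3, "index": 1},
    {"index": 2, "search": 4},
    {"lucene": 1, "search": 1, "query": 2},
]

# DictVectorizer turns the dicts into a sparse count matrix;
# TfidfTransformer then applies the weighting options mentioned
# in the thread (norm, smooth_idf, sublinear_tf).
pipeline = make_pipeline(
    DictVectorizer(),
    TfidfTransformer(norm="l2", smooth_idf=True, sublinear_tf=True),
)
X = pipeline.fit_transform(term_vectors)

# Column order of X, and the IDF weights learned from these documents.
vocab = pipeline.named_steps["dictvectorizer"].feature_names_
idf = pipeline.named_steps["tfidftransformer"].idf_
```

Using Lucene's own document frequencies instead would mean computing the
IDF vector separately, which is the trickier part.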


On 1 July 2014 16:40, Lars Buitinck <[email protected]> wrote:

> 2014-07-01 21:03 GMT+02:00 Geetu Ambwani <[email protected]>:
> > I imagine this transformer would be useful to others who use lucene for
> > text analysis and already have access to term vectors and have the partial
> > pipeline but might still want access to the various weighting schemes
> > available in TfidfVectorizer (ex: norm, smooth_idf, sublinear_tf etc)
>
> Why? Can't DictVectorizer do this?
>
>
> ------------------------------------------------------------------------------
> Open source business process management suite built on Java and Eclipse
> Turn processes into business applications with Bonita BPM Community Edition
> Quickly connect people, data, and systems into organized workflows
> Winner of BOSSIE, CODIE, OW2 and Gartner awards
> http://p.sf.net/sfu/Bonitasoft
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
