2013/11/29 Andreas Hjortgaard Danielsen <andrea...@gmail.com>:
> Hi,
>
> It might be worth noting that Lucene uses the same implementation:
> http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html

Same as what? The current master or @larsmans' suggested fix?

> And Gensim has an option for choosing an additive constant (although the
> default is 0).
> https://github.com/piskvorky/gensim/blob/develop/gensim/models/tfidfmodel.py
>
> Could this be some numerical trick?

Honestly, I don't remember well how we ended up with the current
implementation. I just remember that we introduced bugs at some
point (negative values and a zero-division error). The current state
might still be buggy in some respects, as the last bugfix might
not be the "correct" way to do it.
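
To make those two failure modes concrete, here is a minimal sketch in
plain Python. The function names are hypothetical and the formulas are
generic textbook idf variants, not necessarily what master or any
proposed fix actually computes: an unsmoothed idf divides by zero for a
term with document frequency 0, while adding 1 only to the denominator
yields a negative value for a term that occurs in every document.

```python
import math


def idf_plain(n_docs, df):
    # Unsmoothed idf: log(n / df).
    # Raises ZeroDivisionError when df == 0.
    return math.log(n_docs / df)


def idf_shifted(n_docs, df):
    # Add 1 to the denominator only: log(n / (df + 1)).
    # Avoids the division by zero, but is negative when df == n.
    return math.log(n_docs / (df + 1))


def idf_smoothed(n_docs, df):
    # One possible smoothing (an assumption for illustration):
    # add 1 to numerator and denominator, then add 1 to the result.
    # For 0 <= df <= n the log term is >= 0, so the value is >= 1.
    return math.log((1 + n_docs) / (1 + df)) + 1.0


if __name__ == "__main__":
    n_docs = 4
    for df in (0, 1, 4):
        print(df, idf_shifted(n_docs, df), idf_smoothed(n_docs, df))
```

Whether the constant is a numerical trick or a modelling choice, the
edge cases df == 0 and df == n_docs are the ones worth checking in any
variant.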

-- 
Olivier

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
