From a quick look, your suggestion of passing the term frequency to
TFIDFSimilarity#tf makes sense.
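
To illustrate the difference, here is a rough standalone sketch. It does not
use Lucene at all: the class and method names are hypothetical, and the tf/idf
formulas are written out by hand to mirror what ClassicSimilarity computes
(square root for tf, log((docCount+1)/(docFreq+1)) + 1 for idf). It only
compares "raw tf * idf" against "tf(tf) * idf"; it is not the actual
MoreLikeThis code or a patch.

```java
// Standalone sketch; names are hypothetical and nothing here depends on Lucene.
final class MltTermScoreSketch {

    // Hand-written copy of the ClassicSimilarity tf formula: sqrt(freq).
    static float classicTf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Hand-written copy of the ClassicSimilarity idf formula:
    // log((docCount + 1) / (docFreq + 1)) + 1.
    static float classicIdf(long docFreq, long docCount) {
        return (float) (Math.log((docCount + 1) / (double) (docFreq + 1)) + 1.0);
    }

    // Scoring as described in the report: the raw term frequency times idf.
    static float rawScore(int tf, float idf) {
        return tf * idf;
    }

    // Suggested variant: pass the frequency through tf() before multiplying.
    static float dampedScore(int tf, float idf) {
        return classicTf(tf) * idf;
    }

    public static void main(String[] args) {
        // Assumed example numbers: a term in 10 out of 1000 documents.
        float idf = classicIdf(10, 1000);
        for (int tf : new int[] {1, 4, 16}) {
            System.out.printf(
                "tf=%d raw=%.3f damped=%.3f%n",
                tf, rawScore(tf, idf), dampedScore(tf, idf));
        }
    }
}
```

With the raw variant a term's score grows linearly with its frequency in the
source document, while with the damped variant it grows with the square root,
so frequent terms dominate the selected query terms less.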

Would you like to contribute this change? You can find contributing
guidelines here:
https://github.com/apache/lucene/blob/main/CONTRIBUTING.md.

On Thu, Mar 31, 2022 at 11:46 PM Petko Minkov <pmin...@gmail.com> wrote:
>
> Hi,
>
> I was looking at Lucene's code for MoreLikeThis, specifically this line:
> https://github.com/apache/lucene/blob/69b040fc6292ac47d7f7fc8bc3b7fd601794e54b/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java#L640
>
> In ClassicSimilarity, tf() returns the square root of the term frequency,
> but this code uses the raw TF without calling ClassicSimilarity::tf(). Is
> that a bug? It gives TF a disproportionately high weight compared to IDF.
>
> --Petko



-- 
Adrien

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
