[GitHub] [lucene] rmuir commented on pull request #940: Use similarity.tf() in MoreLikeThis

GitBox Thu, 02 Jun 2022 07:37:54 -0700


rmuir commented on PR #940:
URL: https://github.com/apache/lucene/pull/940#issuecomment-1144943399


   > I think the problem is that we have no test corpus to measure MLT search
   > quality, so we can't directly tell whether taking square roots of raw term
   > frequency improves it. I'm not against the change at all; I just can't
   > estimate its possible effects.
   
   One way to do this is to expand the queries of a test collection with MLT
(a blind feedback approach) and look at the usual relevance measurements. The
easiest way to do that, if you are using the built-in QueryDriver to run
relevance tests, is to modify SimpleQQParser to incorporate MLT:
https://github.com/apache/lucene/blob/main/lucene/benchmark/src/java/org/apache/lucene/benchmark/quality/utils/SimpleQQParser.java#L64
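
As an illustration of the "usual measurements" step, here is a minimal sketch of average precision and MAP that could be used to compare a baseline run against an MLT-expanded run. It is standalone Java and not the benchmark module's own code: the class name `ApSketch`, the method names, and the string document IDs are all hypothetical, and it assumes each ranked list contains no duplicate documents.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: scores ranked result lists
// against relevance judgments (qrels).
class ApSketch {

  // Average precision for one query: the sum of precision@k at every rank k
  // that holds a relevant document, divided by the number of relevant documents.
  static double averagePrecision(List<String> ranked, Set<String> relevant) {
    if (relevant.isEmpty()) {
      return 0.0;
    }
    int hits = 0;
    double sum = 0.0;
    for (int i = 0; i < ranked.size(); i++) {
      if (relevant.contains(ranked.get(i))) {
        hits++;
        sum += (double) hits / (i + 1);
      }
    }
    return sum / relevant.size();
  }

  // Mean average precision over all judged queries. A judged query that is
  // missing from the run contributes an average precision of 0.
  static double meanAveragePrecision(
      Map<String, List<String>> run, Map<String, Set<String>> qrels) {
    if (qrels.isEmpty()) {
      return 0.0;
    }
    double total = 0.0;
    for (Map.Entry<String, Set<String>> e : qrels.entrySet()) {
      List<String> ranked = run.getOrDefault(e.getKey(), List.of());
      total += averagePrecision(ranked, e.getValue());
    }
    return total / qrels.size();
  }
}
```

Computing `meanAveragePrecision` once for the unexpanded queries and once for the MLT-expanded queries, with and without the square-root term frequency, would give comparable numbers for the change under discussion.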


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


