rmuir commented on PR #940: URL: https://github.com/apache/lucene/pull/940#issuecomment-1144943399
> I think the problem is that we have no test corpus to measure the MLT search quality, so we can't directly know if taking square roots of raw term frequency improves the search quality. I'm not against the change at all, just can't estimate the possible effects of this change.

One way to do this is to expand the queries of a test collection with MLT (a blind feedback approach) and look at the usual measurements. The easiest way to do that (if using the built-in QueryDriver to run relevance tests) is to modify SimpleQQParser to incorporate MLT: https://github.com/apache/lucene/blob/main/lucene/benchmark/src/java/org/apache/lucene/benchmark/quality/utils/SimpleQQParser.java#L64
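
To make "the usual measurements" concrete, here is a minimal, self-contained sketch of two common ones, average precision and precision at k, applied to a baseline ranking and an MLT-expanded ranking for one query. This is not code from Lucene's benchmark module: the class name `RankingMetrics`, the document ids, and both rankings are hypothetical and made up for illustration. In a real run the rankings would come from the relevance test itself.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: plain-Java ranking metrics.
public class RankingMetrics {

    /** Average precision of one ranked list against the set of relevant doc ids. */
    static double averagePrecision(List<String> ranked, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        double sum = 0.0;
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
                // Precision at this rank (ranks are 1-based).
                sum += (double) hits / (i + 1);
            }
        }
        // Relevant docs that were never retrieved contribute zero.
        return sum / relevant.size();
    }

    /** Fraction of the top k results that are relevant. */
    static double precisionAtK(List<String> ranked, Set<String> relevant, int k) {
        if (k <= 0) {
            return 0.0;
        }
        int hits = 0;
        for (int i = 0; i < Math.min(k, ranked.size()); i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
            }
        }
        return (double) hits / k;
    }

    public static void main(String[] args) {
        // Made-up judgments and rankings for a single query.
        Set<String> relevant = Set.of("d1", "d3", "d7");
        List<String> baseline = List.of("d2", "d1", "d5", "d3", "d9");
        List<String> expanded = List.of("d1", "d3", "d2", "d7", "d5");

        System.out.printf(Locale.ROOT, "baseline AP=%.3f P@3=%.3f%n",
                averagePrecision(baseline, relevant), precisionAtK(baseline, relevant, 3));
        System.out.printf(Locale.ROOT, "expanded AP=%.3f P@3=%.3f%n",
                averagePrecision(expanded, relevant), precisionAtK(expanded, relevant, 3));
    }
}
```

Averaging `averagePrecision` over all queries of the test collection gives MAP, which could then be compared between the raw-tf and square-root variants of MLT.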