See https://cwiki.apache.org/confluence/display/LUCENE/ScoresAsPercentages which has some broken Nabble links, but is still valid.
TLDR: Scoring just doesn't work the way you think. Don't try to interpret it as an absolute value; it is a relative one.

On Fri, May 28, 2021 at 1:36 PM TK Solr <tksol...@sonic.net> wrote:
>
> I'd like to have suggestions on changing the scoring algorithm
> of MoreLikeThis.
>
> When I feed the identical string as the content of a document in the index
> to MoreLikeThis.like("field", new StringReader(docContent)),
> I get a score less than 1.0 (0.944 in one of my test cases) that I expect.
>
> What is the easiest way to change this so that the score is 1.0 when
> all the terms in the query match all the terms of a document?
> The score should be less than 1.0 if the query contains only a part of the
> terms from the document. (Needless to say, the score should also be less
> than 1.0 if only part of the query terms are found in the document.)
>
> For my purpose, I don't need a sophisticated search relevancy technique
> like TF-IDF. I'd like it to work faster/cheaper.
>
> I tried using BooleanSimilarity, but that ended up returning a score above
> 1.0.
> Also the score is the same as long as all the terms in the query are matched.
> For example, querying "quick brown fox" and "quick brown" yield the same score
> against the doc that has the famous test string.
>
>
> TK
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
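
The behaviour asked for in the quoted message (1.0 only when the query and the document have exactly the same terms, lower otherwise) is a set-overlap measure, not something a Lucene relevance score is designed to give. A minimal sketch, assuming you are willing to compute it yourself outside of Lucene's scoring: the class name TermOverlap, the helper terms() and its crude tokenization are all made up for illustration, and Jaccard overlap is only one possible choice of measure.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class TermOverlap {

    // Hypothetical stand-in for a real analyzer: lowercase, then split on
    // anything that is not a letter or a digit. Duplicate terms collapse.
    static Set<String> terms(String text) {
        Set<String> result = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    // Jaccard overlap: |Q intersect D| / |Q union D|.
    // It is 1.0 only when both term sets are identical, and drops below 1.0
    // when either side has terms the other lacks.
    static double jaccard(String query, String doc) {
        Set<String> q = terms(query);
        Set<String> d = terms(doc);
        Set<String> union = new HashSet<>(q);
        union.addAll(d);
        if (union.isEmpty()) {
            return 0.0; // no terms at all: treat as no similarity
        }
        Set<String> intersection = new HashSet<>(q);
        intersection.retainAll(d);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        String doc = "The quick brown fox jumps over the lazy dog";
        System.out.println(jaccard(doc, doc));               // identical term sets
        System.out.println(jaccard("quick brown fox", doc)); // 3 of 8 distinct terms
        System.out.println(jaccard("quick brown", doc));     // 2 of 8 distinct terms
    }
}
```

With the test string above, "quick brown fox" and "quick brown" get different values (3/8 and 2/8), and only the identical string reaches 1.0. Note that this ignores term frequency and order entirely, so it is a comparison of term sets and not a replacement for relevance ranking.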