Hi Mikhail,

You are correct: it should give an OK upper bound on the scores of term queries and of combinations of term queries via BooleanQuery.
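To illustrate why such a bound holds, here is a minimal, Lucene-free sketch. It is only a model under stated assumptions, not Lucene's actual implementation: the class MaxScoreSketch, its Impact type and the sample numbers are hypothetical, and the scoring function is a textbook-style BM25 with assumed parameters k1 = 1.2 and b = 0.75. The idea is that a term's best possible score is the maximum over its (freq, field length) pairs, and since a disjunction sums the scores of its matching clauses, the sum of the per-term maxima bounds the disjunction.

```java
/** Hypothetical, Lucene-free model of bounding scores from impacts. */
final class MaxScoreSketch {

    /** One (term frequency, field length) pair, loosely analogous to an impact. */
    static final class Impact {
        final int freq;
        final int fieldLength;

        Impact(int freq, int fieldLength) {
            this.freq = freq;
            this.fieldLength = fieldLength;
        }
    }

    // Assumed BM25 parameters for this sketch.
    static final double K1 = 1.2;
    static final double B = 0.75;

    /** Textbook-style BM25 contribution of one term in one document. */
    static double bm25(double idf, int freq, int fieldLength, double avgFieldLength) {
        double lengthNorm = K1 * (1 - B + B * fieldLength / avgFieldLength);
        return idf * freq / (freq + lengthNorm);
    }

    /** Highest score that any of the given impacts can produce for this term. */
    static double termUpperBound(double idf, Impact[] impacts, double avgFieldLength) {
        double max = 0;
        for (Impact impact : impacts) {
            max = Math.max(max, bm25(idf, impact.freq, impact.fieldLength, avgFieldLength));
        }
        return max;
    }

    /** A disjunction sums its matching clause scores, so the sum of bounds is a bound. */
    static double disjunctionUpperBound(double... termBounds) {
        double sum = 0;
        for (double bound : termBounds) {
            sum += bound;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up impacts and idf values for two terms.
        Impact[] jet = { new Impact(2, 12), new Impact(5, 80) };
        Impact[] spider = { new Impact(1, 6) };
        double avgFieldLength = 20.0;

        double jetBound = termUpperBound(1.8, jet, avgFieldLength);
        double spiderBound = termUpperBound(3.1, spider, avgFieldLength);

        System.out.println("jet bound: " + jetBound);
        System.out.println("spider bound: " + spiderBound);
        System.out.println("jet OR spider bound: "
            + disjunctionUpperBound(jetBound, spiderBound));
    }
}
```

The disjunction bound is only tight if some document reaches the maximum for every term at once, which is why this is an upper bound rather than an exact maximum.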

On Wed, May 22, 2024 at 6:57 PM Mikhail Khludnev <m...@apache.org> wrote:

> I'm trying to understand Impacts. Need help.
> https://github.com/apache/lucene/issues/5270#issuecomment-1223383919
> Does it mean
> advanceShallow(0)
> getMaxScore(maxDoc-1)
> gives a good max score estimate at least for a term query?
>
> On Fri, May 10, 2024 at 11:21 PM Mikhail Khludnev <m...@apache.org> wrote:
>
>> Hello Alessandro.
>> Glad to hear!
>> There's not much update from the previously published link: just a tiny
>> test. Guessing max tf doesn't seem really reliable.
>> However, I've got another idea:
>> Can't Impacts give us an exact max score like
>> https://lucene.apache.org/core/9_9_1/core/org/apache/lucene/search/Scorer.html#getMaxScore(int)?
>>
>> I don't know if it's possible and how to do it.
>>
>> On Thu, May 9, 2024 at 6:11 PM Alessandro Benedetti <a.benede...@sease.io>
>> wrote:
>>
>>> Hi Mikhail,
>>> I was thinking again about this regarding Hybrid Search in Solr and the
>>> current
>>> https://solr.apache.org/guide/solr/latest/query-guide/function-queries.html#scale-function
>>> .
>>> Was there any progress on this? Any traction?
>>> Sooner or later I hope to get some funds to work on this, I keep you
>>> updated!
>>> I agree this would be useful in Learning To Rank and Hybrid Search in
>>> general.
>>> The current original score feature is unlikely to be useful if not
>>> normalised per an estimated maximum score.
>>>
>>> Cheers
>>> --------------------------
>>> *Alessandro Benedetti*
>>> Director @ Sease Ltd.
>>> *Apache Lucene/Solr Committer*
>>> *Apache Solr PMC Member*
>>>
>>> e-mail: a.benede...@sease.io
>>>
>>>
>>> *Sease* - Information Retrieval Applied
>>> Consulting | Training | Open Source
>>>
>>> Website: Sease.io <http://sease.io/>
>>> LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter
>>> <https://twitter.com/seaseltd> | Youtube
>>> <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github
>>> <https://github.com/seaseltd>
>>>
>>>
>>> On Mon, 13 Feb 2023 at 12:47, Mikhail Khludnev <m...@apache.org> wrote:
>>>
>>>> Hello.
>>>> Just FYI. I scratched a little prototype
>>>> https://github.com/mkhludnev/likely/blob/main/src/test/java/org/apache/lucene/contrb/highly/TestLikelyReader.java#L53
>>>> To estimate maximum possible score for the query against an index:
>>>> - it creates a virtual index (LikelyReader), which
>>>> - contains all terms from the original index with the same docCount
>>>> - matching all of these terms in the first doc (docnum=0) with the
>>>> maximum termFreq (which estimating is a separate question).
>>>> So, if we search over this LikelyReader we get a score estimate, which
>>>> can hardly be exceeded by the same query over the original index.
>>>> I suppose this might be useful for LTR as a better alternative to the
>>>> query score feature.
>>>>
>>>> On Tue, Dec 6, 2022 at 10:02 AM Mikhail Khludnev <m...@apache.org>
>>>> wrote:
>>>>
>>>>> Hello dev!
>>>>> Users are interested in the meaning of absolute value of the score,
>>>>> but we always reply that it's just relative value. Maximum score of
>>>>> matched
>>>>> docs is not an answer.
>>>>> Ultimately we need to measure how much sense a query has in the index.
>>>>> e.g. [jet OR propulsion OR spider] query should be measured like
>>>>> nonsense, because the best matching docs have much lower scores than
>>>>> hypothetical (and assuming absent) doc matching [jet AND propulsion AND
>>>>> spider].
>>>>> Could it be a method that returns the maximum possible score if all
>>>>> query terms would match. Something like stubbing postings on virtual
>>>>> all_matching doc with average stats like tf and field length and kicks
>>>>> scorers in? It reminds me something about probabilistic retrieval, but not
>>>>> much. Is there anything like this already?
>>>>>
>>>>> --
>>>>> Sincerely yours
>>>>> Mikhail Khludnev
>>>>>
>>>>
>>>>
>>>> --
>>>> Sincerely yours
>>>> Mikhail Khludnev
>>>>
>>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>

--
Adrien