Re: Adding vs multiplicating scores when implementing "recency"

Michael Sokolov Fri, 17 Sep 2021 12:40:47 -0700

ah, thanks for the explanation

On Fri, Sep 17, 2021 at 10:11 AM Adrien Grand <[email protected]> wrote:
>
> This is one requirement indeed. Since WAND reasons about partially
> evaluated documents, it also requires that matching one more clause makes
> the overall score higher, which is why we introduced the requirement that
> scores must be positive in 8.0. For multiplication, this would require
> every score to be greater than 1, so that multiplying in one more clause
> never lowers the product.
>
> If someone really wanted to multiply scores, the easiest way might be to
> create a query wrapper that takes the log of the scores of the wrapped
> query, and rely on log(a)+log(b) = log(a * b).
>
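
To make the log trick above concrete, here is a minimal sketch in plain
Java. It is not Lucene code: the class name LogScoreSketch, its helper
methods, and the (text score, recency boost) pairs are all hypothetical,
chosen only for illustration. It checks that ranking by the sum of logs
gives the same order as ranking by the product. Note that log(s) > 0
exactly when s > 1, which lines up with the "greater than 1" requirement.

```java
import java.util.Arrays;

class LogScoreSketch {

    // Multiplicative combination: product of the per-clause scores.
    static double product(double[] scores) {
        double p = 1.0;
        for (double s : scores) {
            p *= s;
        }
        return p;
    }

    // Additive combination of log-scores: log(a) + log(b) = log(a * b).
    static double sumOfLogs(double[] scores) {
        double sum = 0.0;
        for (double s : scores) {
            sum += Math.log(s);
        }
        return sum;
    }

    // Returns document indices ordered by descending key.
    static Integer[] rank(double[] keys) {
        Integer[] order = new Integer[keys.length];
        for (int i = 0; i < keys.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(keys[b], keys[a]));
        return order;
    }

    public static void main(String[] args) {
        // Hypothetical {textScore, recencyBoost} pairs, all greater than 1
        // so that every log term is positive.
        double[][] docs = { {2.5, 1.8}, {4.0, 1.1}, {3.0, 1.6} };

        double[] byProduct = new double[docs.length];
        double[] bySumOfLogs = new double[docs.length];
        for (int i = 0; i < docs.length; i++) {
            byProduct[i] = product(docs[i]);
            bySumOfLogs[i] = sumOfLogs(docs[i]);
        }

        // Both lines print the same order, since log is strictly increasing.
        System.out.println(Arrays.toString(rank(byProduct)));
        System.out.println(Arrays.toString(rank(bySumOfLogs)));
    }
}
```

Because the combined value is a plain sum of positive terms, adding one
more clause can only raise it, which is the property the quoted message
says WAND depends on.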
> On Fri, Sep 17, 2021 at 2:47 PM Michael Sokolov <[email protected]> wrote:
>
> > Not advocating any particular approach here, just curious: could BMW
> > also function in the presence of a doc-score (like recency) that is
> > multiplied? My vague understanding is that as long as the scoring
> > formula is monotonic in all of its inputs, and we have block-encoded
> > the inputs, then we could compute a max score for a block?
> >
> > On Thu, Sep 16, 2021 at 12:41 PM Adrien Grand <[email protected]> wrote:
> > >
> > > Hello,
> > >
> > > You are correct that the contribution would be additive in that case. We
> > > don't provide an easy way to make the contribution multiplicative.
> > >
> > > There is some debate about what is the best way to combine BM25 scores
> > > with query-independent features, though in the discussions I've seen
> > > contributions were summed up and the debate was more about whether they
> > > should be normalized or not.
> > >
> > > How much recency impacts ranking indeed depends on the number of terms
> > > and how frequent these terms are. One way that I'm interpreting the fact
> > > that not everyone recommends normalizing scores is that this way the
> > > query score dominates when the query is looking for something very
> > > specific, because it includes many terms or because it uses very
> > > specific terms - which may be a feature. This approach also works well
> > > for Lucene since dynamic pruning via Block-Max WAND keeps working when
> > > query-independent features are incorporated into the final score, which
> > > helps figure out the top hits without having to collect all matches.
> > >
> > > On Thu, Sep 16, 2021 at 5:40 PM Nicolás Lichtmaier
> > > <[email protected]> wrote:
> > >
> > > > In March I asked a question here that got no answers at all. As it is
> > > > still something that I'd very much like to know, I'll ask again.
> > > >
> > > > To implement "recency" in a search you would add a boolean clause with
> > > > a LongPoint.newDistanceFeatureQuery(), right? But that's additive,
> > > > meaning that recency will have a different impact on searches with
> > > > different numbers of terms, right? With more terms the recency
> > > > component's contribution to the score will be more and more "diluted".
> > > > However... I only see examples doing it this way, and I would need to
> > > > do something weird to implement a multiplicative change of the
> > > > score... Am I missing something?
> > > >
> > > > Thanks!
> > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: [email protected]
> > > > For additional commands, e-mail: [email protected]
> > > >
> > > >
> > >
> > > --
> > > Adrien
> >
> >
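
As a rough illustration of the "dilution" raised in the original question,
the sketch below compares how much a recency clause separates a fresh
document from a stale one when it is added versus multiplied. This is plain
Java, not Lucene code; the class name RecencyDilutionSketch, its methods,
and all the numbers are hypothetical. A larger text score stands in for a
query with more matching terms.

```java
class RecencyDilutionSketch {

    // Ratio of final scores (fresh doc / stale doc) when recency is added
    // to the same text score for both documents.
    static double additiveRatio(double textScore, double fresh, double stale) {
        return (textScore + fresh) / (textScore + stale);
    }

    // Same ratio when recency multiplies the text score instead.
    static double multiplicativeRatio(double textScore, double fresh, double stale) {
        return (textScore * fresh) / (textScore * stale);
    }

    public static void main(String[] args) {
        double fresh = 2.0; // hypothetical recency value for a new document
        double stale = 1.0; // hypothetical recency value for an old document

        // Hypothetical text scores: few matching terms vs. many matching terms.
        double[] textScores = { 2.0, 8.0 };

        for (double text : textScores) {
            System.out.printf(
                "text=%.1f additive=%.3f multiplicative=%.3f%n",
                text,
                additiveRatio(text, fresh, stale),
                multiplicativeRatio(text, fresh, stale));
        }
    }
}
```

Under these assumptions the additive ratio moves toward 1 as the text score
grows, while the multiplicative ratio does not depend on the text score at
all, since it cancels out.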

