Re: Adding vs multiplicating scores when implementing "recency"

Adrien Grand Fri, 17 Sep 2021 07:11:23 -0700

This is one requirement indeed. Since WAND reasons about partially
evaluated documents, it also requires that matching one more clause never
lowers the overall score, which is why we introduced the requirement in
Lucene 8.0 that scores must not be negative. For multiplication, the
equivalent requirement would be scores that are at least 1.
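
To illustrate the partial-evaluation point, here is a minimal sketch in plain
Java. It does not use any Lucene classes; PartialScores and its methods are
hypothetical names made up for this example. It only shows the arithmetic:
adding one more non-negative clause score never lowers a sum, while
multiplying by one more clause score lowers a product whenever that factor is
below 1.

```java
// Hypothetical helper for illustration only, not part of Lucene.
final class PartialScores {

    /** Sum of clause scores: never decreases when a non-negative score is added. */
    static double sum(double... scores) {
        double total = 0.0;
        for (double s : scores) {
            total += s;
        }
        return total;
    }

    /** Product of clause scores: decreases when a factor below 1 is included. */
    static double product(double... scores) {
        double total = 1.0;
        for (double s : scores) {
            total *= s;
        }
        return total;
    }
}
```

With a text score of 2.0 and a recency score of 0.5, sum(2.0, 0.5) is larger
than sum(2.0), but product(2.0, 0.5) is smaller than product(2.0), so a
partially evaluated document could no longer be bounded from below by the
clauses already seen.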


If someone really wanted to multiply scores, the easiest way might be to
create a query wrapper that takes the log of the scores of the wrapped
query, and rely on log(a)+log(b) = log(a * b).
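
A rough sketch of that arithmetic in plain Java, outside of Lucene: LogScores
and its methods are hypothetical names, and rejecting raw scores below 1 is my
assumption here so that every log stays non-negative. A real query wrapper
would need to apply the same transformation inside its scorer and to its
maximum-score estimates.

```java
// Hypothetical helper for illustration only, not part of Lucene.
final class LogScores {

    /** Log of a raw score; raw scores below 1 are rejected because their log is negative. */
    static double logScore(double rawScore) {
        if (!(rawScore >= 1.0)) {
            throw new IllegalArgumentException("raw score must be >= 1, got " + rawScore);
        }
        return Math.log(rawScore);
    }

    /** Sums the logs of the raw scores, which equals the log of their product. */
    static double combine(double... rawScores) {
        double total = 0.0;
        for (double s : rawScores) {
            total += logScore(s);
        }
        return total;
    }
}
```

Since log is monotonically increasing, ranking documents by combine(a, b)
gives the same order as ranking them by a * b, while the combination itself
stays additive.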

On Fri, Sep 17, 2021 at 14:47, Michael Sokolov <[email protected]> wrote:

> Not advocating any particular approach here, just curious: could BMW
> also function in the presence of a doc-score (like recency) that is
> multiplied? My vague understanding is that as long as the scoring
> formula is monotonic in all of its inputs, and we have block-encoded
> the inputs, then we could compute a max score for a block?
>
> On Thu, Sep 16, 2021 at 12:41 PM Adrien Grand <[email protected]> wrote:
> >
> > Hello,
> >
> > You are correct that the contribution would be additive in that case. We
> > don't provide an easy way to make the contribution multiplicative.
> >
> > There is some debate about what is the best way to combine BM25 scores with
> > query-independent features, though in the discussions I've seen
> > contributions were summed up and the debate was more about whether they
> > should be normalized or not.
> >
> > How much recency impacts ranking indeed depends on the number of terms and
> > how frequent these terms are. One way that I'm interpreting the fact that
> > not everyone recommends normalizing scores is that this way the query score
> > dominates when the query is looking for something very specific, because it
> > includes many terms or because it uses very specific terms - which may be a
> > feature. This approach also works well for Lucene since dynamic pruning via
> > Block-Max WAND keeps working when query-independent features are
> > incorporated into the final score, which helps figure out the top hits
> > without having to collect all matches.
> >
> > On Thu, Sep 16, 2021 at 5:40 PM Nicolás Lichtmaier
> > <[email protected]> wrote:
> >
> > > In March I asked a question here that got no answers at all. As it is
> > > still something that I'd very much like to know, I'll ask again.
> > >
> > > To implement "recency" in a search you would add a boolean clause with
> > > a LongPoint.newDistanceFeatureQuery(), right? But that's additive,
> > > meaning that recency will have a different impact for searches with
> > > different numbers of terms, right? With more terms, the recency
> > > component's contribution to the score will be more and more "diluted".
> > > However... I only see examples that do it this way, and I would need to
> > > do something weird to implement a multiplicative change of the score...
> > > Am I missing something?
> > >
> > > Thanks!
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [email protected]
> > > For additional commands, e-mail: [email protected]
> > >
> > >
> >
> > --
> > Adrien
>
