[ 
https://issues.apache.org/jira/browse/LUCENE-8216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16694443#comment-16694443
 ] 

Jim Ferenczi commented on LUCENE-8216:
--------------------------------------

{quote}

Should we require that weights are greater than or equal to 1 so that ttf is 
guaranteed to be greater than or equal to df?

{quote}

Yes, this is required because we keep the max of the df and the sum of the ttf. 
I'll add a check.
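
To illustrate why the check matters, here is a minimal sketch of combining per-field statistics. It is not the code from the patch: the class and method names are hypothetical, and it assumes that each field's ttf is multiplied by its weight before summing. Because every field satisfies ttf >= df, a weighted sum with all weights >= 1 cannot fall below the largest df.

```java
public final class CombinedFieldStats {

    private CombinedFieldStats() {}

    /** Combined df: the max of the per-field document frequencies. */
    public static long combinedDocFreq(long[] docFreqs) {
        long df = 0;
        for (long d : docFreqs) {
            df = Math.max(df, d);
        }
        return df;
    }

    /**
     * Combined ttf: the weighted sum of the per-field total term frequencies.
     * Rejects weights below 1 so that the result stays >= the combined df
     * (assuming each field already has ttf >= df).
     */
    public static long combinedTotalTermFreq(float[] weights, long[] totalTermFreqs) {
        if (weights.length != totalTermFreqs.length) {
            throw new IllegalArgumentException("weights and totalTermFreqs must have the same length");
        }
        double ttf = 0;
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] < 1f) {
                throw new IllegalArgumentException("weight must be >= 1, got " + weights[i]);
            }
            ttf += (double) weights[i] * totalTermFreqs[i];
        }
        return (long) ttf;
    }
}
```

With a weight of 0.5 on a field where ttf equals df, the weighted ttf would be half the df, which is the case the check rules out.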

{quote}

In BM25Query.rewrite(), I think you need to put the 'single field, single term' 
case before the 'single field' case?

{quote}

Good catch, thanks.
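
The ordering issue can be sketched as follows. This is only an illustration with hypothetical names, not the rewrite() from the patch: since 'single field, single term' is a special case of 'single field', it has to be tested first, otherwise the broader branch always wins and the narrower one is unreachable.

```java
public final class RewriteOrder {

    /** Hypothetical labels for the three rewrite cases discussed above. */
    public enum Kind { SINGLE_FIELD_SINGLE_TERM, SINGLE_FIELD, GENERAL }

    private RewriteOrder() {}

    public static Kind classify(int numFields, int numTerms) {
        // Most specific case first: it is a subset of the 'single field' case.
        if (numFields == 1 && numTerms == 1) {
            return Kind.SINGLE_FIELD_SINGLE_TERM;
        }
        // Broader case second; it would shadow the branch above if checked earlier.
        if (numFields == 1) {
            return Kind.SINGLE_FIELD;
        }
        return Kind.GENERAL;
    }
}
```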

> Better cross-field scoring
> --------------------------
>
>                 Key: LUCENE-8216
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8216
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Major
>             Fix For: master (8.0)
>
>         Attachments: LUCENE-8216.patch, LUCENE-8216.patch
>
>
> I'd like Lucene to have better support for scoring across multiple fields. 
> Today we have BlendedTermQuery which tries to help there but it probably 
> tries to do too much on some aspects (handling cross-field term queries AND 
> synonyms) and too little on other ones (it tries to merge index-level 
> statistics, but not per-document statistics like tf and norm).
> Maybe we could implement something like BM25F so that queries across multiple 
> fields would retain the benefits of BM25 like the fact that the impact of the 
> term frequency saturates quickly, which is not the case with BlendedTermQuery 
> if you have occurrences across many fields.
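
The saturation argument in the description can be illustrated with a simplified sketch. It is not Lucene's BM25Similarity or BlendedTermQuery: it ignores idf and length normalization, uses the tf / (tf + k1) form of the BM25 term-frequency component, and all names are hypothetical. Summing per-field scores grows with the number of matching fields, while combining the weighted frequencies first (in the spirit of BM25F) stays below 1.

```java
public final class TfSaturationDemo {

    private TfSaturationDemo() {}

    /** Simplified BM25 tf component without length normalization. */
    public static double saturate(double tf, double k1) {
        return tf / (tf + k1);
    }

    /** Scores each field independently and adds the results. */
    public static double sumOfPerFieldScores(double[] tfs, double k1) {
        double sum = 0;
        for (double tf : tfs) {
            sum += saturate(tf, k1);
        }
        return sum;
    }

    /** Combines the weighted term frequencies first, then saturates once. */
    public static double combinedFieldScore(double[] tfs, double[] weights, double k1) {
        if (tfs.length != weights.length) {
            throw new IllegalArgumentException("tfs and weights must have the same length");
        }
        double combinedTf = 0;
        for (int i = 0; i < tfs.length; i++) {
            combinedTf += weights[i] * tfs[i];
        }
        return saturate(combinedTf, k1);
    }
}
```

For example, with one occurrence in each of five equally weighted fields and k1 = 1.2, the summed score exceeds 1, whereas the combined score remains below 1.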



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
