[ 
https://issues.apache.org/jira/browse/LUCENE-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15329457#comment-15329457
 ] 

Ferenczi Jim commented on LUCENE-7337:
--------------------------------------

??A simple fix would be to replace the empty boolean query produced by the 
multi term query with a MatchNoDocsQuery but I am not sure that it's the best 
way to fix.??

I am no longer sure about this statement. Conceptually, a MatchNoDocsQuery and 
a BooleanQuery with no clauses are similar. However, what I proposed assumed 
that the normalization value of the MatchNoDocsQuery is 1. I think this would 
be confusing, since that value is supposed to reflect the maximum score the 
query can produce (which is 0 in this case). Currently a boolean query or a 
disjunction query with no clauses returns 0 for the normalization. I think that 
is the expected behavior, even though it breaks the distributed case as 
explained in my previous comment. 
For empty queries that result from an expansion (multi term query), maybe we 
could add yet another special query, something like MatchNoExpansionQuery, that 
would use a ConstantScoreWeight? I am proposing this because it would 
distinguish a query that matches no documents regardless of the context from a 
query that matches no documents because of the context (useful for the 
distributed case).
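
To make the distinction concrete, here is a toy model in plain Java. It is only 
a sketch: none of these types are Lucene classes, the names are hypothetical, 
and it assumes that a constant-score clause contributes boost^2 to the 
normalization while an empty boolean query contributes 0.

```java
// Toy model only: these types are hypothetical and are not Lucene classes.
final class NoExpansionSketch {

    /** How a clause with a given boost feeds the query normalization. */
    interface Clause {
        double valueForNormalization(double boost);
    }

    // Empty boolean query: contributes 0, so the boost never reaches the normalization.
    static final Clause EMPTY_BOOLEAN = boost -> 0.0;

    // Assumed behavior of the proposed MatchNoExpansionQuery: it matches nothing,
    // but still contributes boost^2, as a constant-score clause is assumed to do.
    static final Clause MATCH_NO_EXPANSION = boost -> boost * boost;

    // A clause that did expand; idf is assumed to be 1 to keep the arithmetic simple.
    static final Clause EXPANDED = boost -> boost * boost;

    public static void main(String[] args) {
        double boost = 100.0;
        System.out.println("empty boolean:      " + EMPTY_BOOLEAN.valueForNormalization(boost));
        System.out.println("match no expansion: " + MATCH_NO_EXPANSION.valueForNormalization(boost));
        System.out.println("expanded:           " + EXPANDED.valueForNormalization(boost));
    }
}
```

Under these assumptions, a shard with no expansions and a shard with expansions 
would feed the same value into the normalization, which is the point of the 
proposal.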

> MultiTermQuery are sometimes rewritten into an empty boolean query
> ------------------------------------------------------------------
>
>                 Key: LUCENE-7337
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7337
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>            Reporter: Ferenczi Jim
>            Priority: Minor
>
> MultiTermQuery is sometimes rewritten to an empty boolean query (depending 
> on the rewrite method). This can happen, for instance, when no expansions 
> are found for a fuzzy query.
> This can be problematic when the multi term query is boosted. 
> For instance, consider the following query:
> `((title:bar~1)^100 text:bar)`
> This is a boolean query with two optional clauses. The first one is a fuzzy 
> query on the field title with a boost of 100. 
> If there is no expansion for "title:bar~1" the query is rewritten into:
> `(()^100 text:bar)`
> ... and when expansions are found:
> `((title:bars | title:bar)^100 text:bar)`
> The scoring of those two queries will differ: the normalization factor 
> and the norm for the first query will both be equal to 1 (the boost is 
> ignored because the empty boolean query is not taken into account when 
> computing the normalization factor), whereas the second query will have a 
> normalization factor of 10,000 (100*100) and a norm equal to 0.01. 
> This kind of discrepancy can happen within a single index because the 
> expansions for the fuzzy query are done at the segment level. It can also 
> happen when multiple indices are queried (Solr/ElasticSearch case).
> A simple fix would be to replace the empty boolean query produced by the 
> multi term query with a MatchNoDocsQuery but I am not sure that it's the best 
> way to fix. WDYT ?
>  
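
The figures in the description can be reproduced with a small calculation. This 
is a sketch, not Lucene code: it assumes the norm is computed as 
1 / sqrt(normalization factor), in the style of a classic TF-IDF similarity, and 
the raw clause score used in `main` is a made-up value for illustration.

```java
// Sketch only: assumes norm = 1 / sqrt(sumOfSquaredWeights).
final class QueryNormSketch {

    // Turns the normalization factor (sum of squared weights) into the query norm.
    static double queryNorm(double sumOfSquaredWeights) {
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }

    public static void main(String[] args) {
        double boost = 100.0;

        // No expansion: the issue reports a normalization factor of 1.
        double normWithoutExpansion = queryNorm(1.0);

        // With expansions: the issue reports a factor of 10,000 (100*100).
        double normWithExpansion = queryNorm(boost * boost);

        // Hypothetical raw score of the text:bar clause before normalization.
        double rawTextScore = 2.0;

        System.out.println("norm without expansion: " + normWithoutExpansion);
        System.out.println("norm with expansion:    " + normWithExpansion);
        System.out.println("text:bar without expansion: " + rawTextScore * normWithoutExpansion);
        System.out.println("text:bar with expansion:    " + rawTextScore * normWithExpansion);
    }
}
```

The same `text:bar` clause ends up scaled by very different norms depending on 
whether the fuzzy clause expanded, so scores from different segments or indices 
are not comparable.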


