[ 
https://issues.apache.org/jira/browse/SOLR-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220854#comment-15220854
 ] 

Greg Pendlebury commented on SOLR-8812:
---------------------------------------

I don't know that what we are talking about here is a 'workaround' at all. Solr 
is doing exactly what it is being asked to do. I know it is disrupting an 
existing user base, so it warrants discussion and maybe even a 'fix'... but the 
existing user base were leaving a non-configured parameter at its default value 
(which probably didn't match their use case) and it only worked because the 
parameter was being ignored by edismax. The fact that the parameter was ignored
is what caused the real bugs reported in SOLR-2649.

I think there has always been confusion over how this works under the hood, and 
that still continues. q.op and mm apply to two different parts of the query, 
and each of them has other factors that come into play.
 * q.op sets the default boolean operator, which is applied pre-parse (or in the
very earliest stages of parsing)
 * mm applies to (top level) clauses which have the SHOULD occur flag *after* 
Solr translates all the boolean operators
 * if mm is not explicitly set, the default value is determined by q.op (I
haven't verified this, but that is Jan's input above). The old documentation
says the default is always 100%, but I have always set it explicitly, so I
have no experience with the default.
 * Solr translates boolean operators into occurs flags differently depending on 
the value of q.op. In particular q.op=AND causes non-intuitive generation of 
occurs flags if looked at from a purely boolean perspective.
 * mm does not make much sense at all if you think about search as a purely
boolean query (i.e. the result either matches or doesn't) instead of in terms
of occurs flags (i.e. the score of the result is either higher or lower)
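
To make the occurs-flag point concrete, here is a minimal Python sketch of how mm interacts with SHOULD clauses for a single document. It is a toy model, not Solr or Lucene code: the names Clause and doc_matches are made up for illustration, and mm is taken as an absolute clause count rather than a percentage.

```python
from dataclasses import dataclass

MUST, SHOULD, MUST_NOT = "MUST", "SHOULD", "MUST_NOT"


@dataclass
class Clause:
    occur: str     # one of MUST, SHOULD, MUST_NOT
    matches: bool  # whether this clause matches the document under test


def doc_matches(clauses, mm=0):
    """Toy model of a Lucene-style boolean query evaluated on one document."""
    must = [c for c in clauses if c.occur == MUST]
    should = [c for c in clauses if c.occur == SHOULD]

    # A purely negative query matches nothing.
    if not must and not should:
        return False
    # Every MUST clause has to match; no MUST_NOT clause may match.
    if any(not c.matches for c in must):
        return False
    if any(c.occur == MUST_NOT and c.matches for c in clauses):
        return False

    hits = sum(1 for c in should if c.matches)
    if mm > 0:
        # mm only counts SHOULD clauses.
        return hits >= mm
    # Without mm, SHOULD clauses are optional when a MUST clause exists;
    # otherwise at least one of them has to match.
    return bool(must) or hits >= 1


# The two top-level SHOULD clauses from this issue: id:12345 matches,
# text:zzzzzzzzzz does not.
issue_clauses = [Clause(SHOULD, True), Clause(SHOULD, False)]
print(doc_matches(issue_clauses, mm=2))
print(doc_matches(issue_clauses, mm=0))
```

With the two SHOULD clauses from the 5.5.0 parsedquery in this issue, (id:12345 (text:zzzzzzzzzz))~2, only one clause matches, so mm=2 rejects the document while mm=0 accepts it.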

So now that SOLR-2649 has come along, it slightly muddies the water because:
 * q.op is no longer hard-coded to OR. Pre-patch, the user could set q.op=AND,
but it had no effect on the query
 * The presence of an operator no longer turns off the mm feature

*My take on the issue is that users who want to use boolean operators in 
edismax should pay attention to the mm parameter, and make sure their choice 
matches their use case*. Previously they didn't have to, but the handling of
boolean operators in edismax was buggy (debatable: it has been argued that
this simply wasn't the use case edismax was first written for).

Having said that, IF anything were to change, I would only make a subtle
adjustment to how the default value of mm is chosen. Maybe something like this:

IF (the query contains a boolean operator) AND (mm has not been explicitly set) 
THEN (mm = 0%)

It is a tweak on the work Jan did in SOLR-2649, so that instead of turning off 
mm in response to a boolean operator being present, we instead influence the 
default value. We still let users ultimately set up their parameters however 
they want, though. If the user has a use case that includes both boolean
operators and mm logic... have fun.
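
The proposed rule can be sketched in a few lines of Python. This is an illustration only, not the edismax implementation: effective_mm is a hypothetical helper, operator detection is a naive token match on AND, OR and NOT (it ignores quoting and the + and - prefixes), and the fallback of 100% for q.op=AND and 0% otherwise follows Jan's description above, which is unverified.

```python
import re

# Naive, assumed detection: a standalone upper-case AND / OR / NOT token.
_BOOLEAN_OPERATOR = re.compile(r"(?<!\S)(?:AND|OR|NOT)(?!\S)")


def effective_mm(query, q_op="OR", mm=None):
    """Pick the mm value to use for a query (hypothetical helper)."""
    # An explicitly set mm always wins; the user stays in control.
    if mm is not None:
        return mm
    # Proposed tweak: a boolean operator changes the default, nothing more.
    if _BOOLEAN_OPERATOR.search(query):
        return "0%"
    # Assumed existing default, derived from q.op (unverified).
    return "100%" if q_op.upper() == "AND" else "0%"


print(effective_mm("id:12345 OR zzzzzzzzzz", q_op="AND"))
print(effective_mm("solr edismax", q_op="AND"))
print(effective_mm("id:12345 OR zzzzzzzzzz", q_op="AND", mm="100%"))
```

Under these assumptions the query from this issue would get mm=0% by default, while a user who sets mm explicitly keeps whatever value they chose.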

> ExtendedDismaxQParser (edismax) ignores Boolean OR when q.op=AND
> ----------------------------------------------------------------
>
>                 Key: SOLR-8812
>                 URL: https://issues.apache.org/jira/browse/SOLR-8812
>             Project: Solr
>          Issue Type: Bug
>          Components: query parsers
>    Affects Versions: 5.5
>            Reporter: Ryan Steinberg
>            Assignee: Erick Erickson
>            Priority: Blocker
>             Fix For: 6.0, 5.5.1
>
>         Attachments: SOLR-8812.patch
>
>
> The edismax parser ignores Boolean OR in queries when q.op=AND. This behavior 
> is new to Solr 5.5.0 and an unexpected major change.
> Example:
>       "q": "id:12345 OR zzzzzzzzzz",
>       "defType": "edismax",
>       "q.op": "AND",
> where "12345" is a known document ID and "zzzzzzzzzz" is a string NOT present 
> in my data
> Version 5.5.0 produces zero results:
>     "rawquerystring": "id:12345 OR zzzzzzzzzz",
>     "querystring": "id:12345 OR zzzzzzzzzz",
>     "parsedquery": "(+((id:12345 DisjunctionMaxQuery((text:zzzzzzzzzz)))~2))/no_coord",
>     "parsedquery_toString": "+((id:12345 (text:zzzzzzzzzz))~2)",
>     "explain": {},
>     "QParser": "ExtendedDismaxQParser"
> Version 5.4.0 produces one result as expected:
>     "rawquerystring": "id:12345 OR zzzzzzzzzz",
>     "querystring": "id:12345 OR zzzzzzzzzz",
>     "parsedquery": "(+(id:12345 DisjunctionMaxQuery((text:zzzzzzzzzz))))/no_coord",
>     "parsedquery_toString": "+(id:12345 (text:zzzzzzzzzz))",
>     "explain": {},
>     "QParser": "ExtendedDismaxQParser"


