[ https://issues.apache.org/jira/browse/LUCENE-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12506703 ]
Doron Cohen commented on LUCENE-933:
------------------------------------

So an acceptable solution is: the query parser will ignore empty clauses (e.g. ' ( ) ') resulting from word filtering, the same as it already does for single words.

A straightforward fix is for QueryParser to avoid adding null (inner) queries into (outer) clause sets. (It makes sense, too.)

However, this has a side effect: for queries that become "empty" as a result of filtering (stopping), QueryParser would now return null. This is an API semantics change, because applications that used to get a BooleanQuery with 0 clauses as the parse result would now get a null query.

Here is a closer look at the behavior change:

Original behavior:
(1) parse(" ") == ParseException
(2) parse("( )") == ParseException
(3) parse("stop") == " " (actually a boolean query with 0 clauses)
(4) parse("(stop)") == " " (actually a boolean query with 0 clauses)
(5) parse("a stop b") == "a b"
(6) parse("a (stop) b") == "a () b" (middle part is a boolean query with 0 clauses)
(7) parse("a ((stop)) b") == "a () b" (again, middle part is a boolean query with 0 clauses)

Modified behavior:
(3) parse("stop") == null
(4) parse("(stop)") == null
(6) parse("a (stop) b") == "a b"
(7) parse("a ((stop)) b") == "a b"

I think the modified behavior is the right one: applications can test a query for being null and realize that it is a no-op. However, backwards compatibility is important. Would this change break existing applications with annoying new NPEs?

As an alternative, the QueryParser parse() methods can be modified to return a phony empty BQ instead of null, for the sake of backwards compatibility.

Thoughts?
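To make the two options concrete, here is a minimal sketch of the null-skipping rule and of the "phony empty BQ" fallback. It is a toy model, not the actual QueryParser patch: `Query`, `Term`, `Bool`, `combine` and `orEmpty` are hypothetical stand-ins invented for illustration, and a null entry represents a term that the analyzer filtered out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: Query, Term and Bool are hypothetical stand-ins,
// not Lucene's Query, TermQuery and BooleanQuery.
final class EmptyClauseSketch {

    interface Query {}

    static final class Term implements Query {
        final String text;
        Term(String text) { this.text = text; }
        @Override public String toString() { return text; }
    }

    static final class Bool implements Query {
        final List<Query> clauses;
        Bool(List<Query> clauses) { this.clauses = clauses; }
        @Override public String toString() {
            StringBuilder sb = new StringBuilder("(");
            for (int i = 0; i < clauses.size(); i++) {
                if (i > 0) sb.append(' ');
                sb.append(clauses.get(i));
            }
            return sb.append(')').toString();
        }
    }

    /** Modified behavior: skip null sub-queries; return null if nothing is left. */
    static Query combine(List<Query> parsed) {
        List<Query> kept = new ArrayList<>();
        for (Query q : parsed) {
            if (q != null) kept.add(q);
        }
        return kept.isEmpty() ? null : new Bool(kept);
    }

    /** Backwards-compatible alternative: hand back an empty Bool instead of null. */
    static Query orEmpty(Query q) {
        return q == null ? new Bool(new ArrayList<>()) : q;
    }

    public static void main(String[] args) {
        Query a = new Term("a");
        Query b = new Term("b");

        // Case (7), "a ((stop)) b": the filtered word "stop" is modeled as null.
        Query inner = combine(Arrays.asList((Query) null)); // (stop)
        Query middle = combine(Arrays.asList(inner));       // ((stop))
        Query whole = combine(Arrays.asList(a, middle, b));
        System.out.println(whole);                          // prints (a b)

        // Case (3), "stop" alone: null under the modified behavior,
        // or an empty Bool if the compatibility wrapper is applied.
        Query stop = combine(Arrays.asList((Query) null));
        System.out.println(stop == null);                   // prints true
        System.out.println(orEmpty(stop));                  // prints ()
    }
}
```

In this model the compatibility wrapper is applied only at the top level, so nested empty groups are still dropped while callers that expect a non-null result keep working.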
> QueryParser can produce empty sub BooleanQueries when Analyzer produces no tokens for input
> -------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-933
>                 URL: https://issues.apache.org/jira/browse/LUCENE-933
>             Project: Lucene - Java
>          Issue Type: Bug
>            Reporter: Hoss Man
>            Assignee: Doron Cohen
>
> as triggered by SOLR-261, if you have a query like this...
>    +foo:BBB +(yak:AAA baz:CCC)
> ...where the analyzer produces no tokens for the "yak:AAA" or "baz:CCC"
> portions of the query (possibly because they are stop words) the resulting
> query produced by the QueryParser will be...
>    +foo:BBB +()
> ...that is a BooleanQuery with two required clauses, one of which is an empty
> BooleanQuery with no clauses.
> this does not appear to be "good" behavior.
> In general, QueryParser should be smarter about what it does when it
> encounters parens whose contents result in an empty BooleanQuery -- but
> what exactly it should do in the following situations...
>  a) +foo:BBB +()
>  b) +foo:BBB ()
>  c) +foo:BBB -()
> ...is up for interpretation. I would think situation (b) clearly lends
> itself to dropping the sub-BooleanQuery completely. situation (c) may also
> lend itself to that solution, since semantically it means "don't allow a match
> on any queries in the empty set of queries". .... I have no idea what the
> "right" thing to do for situation (a) is.