Thank you so much, Erik and Morus. I now have a clear idea of how it works. I will try to implement custom code that adds parentheses to boolean expressions, following some rules about operator precedence.
Omar

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 02, 2005 6:26 AM
To: Lucene Users List
Subject: Re: help with boolean expression

I'm deep into implementing a custom (not generalizable, sorry) query parser and am evaluating this very issue now. Lucene indeed does some funny stuff with boolean operators. Output the toString of your resulting Query objects to see the details, or have a look at the Bugzilla issue that Morus mentions below.

First some background: BooleanQuery clauses each have their own set of required/optional/prohibited attributes. Putting operators between them is awkward in the Lucene sense because the parser has to set each clause individually, not in relation to another one.

QueryParser takes the most recent operator and applies it to both the clause before and the clause after, and there is no sense of operator precedence (such as in a math expression like 1 + 2 * 3). In the case of A AND B OR C, when the AND is encountered, it sets the required attribute for both A and B, but then when the OR is encountered it sets the optional attribute on B and C, stepping on the previous required flag for B. The same happens with A OR B AND C.

I agree that the current behavior is awkward. Is it worth breaking backwards compatibility to correct this with the patch applied?

As for the default operator, it does not come into play in your example expressions because every clause there has an explicit conjunction that sets the flag. The default operator comes into play for an expression like A B AND C, where it would be used to set the flag on the A clause.

Erik

On Mar 1, 2005, at 9:12 AM, Omar Didi wrote:

> I found something kind of weird about the way Lucene interprets
> boolean expressions without parentheses.
> When I run the query A AND B OR C, it returns only the documents that
> have A (in other words, as if the query was just the term A).
> When I run the query A OR B AND C, it returns only the documents that
> have B AND C (as if the query was just B AND C). I set the default
> operator in my application to be AND.
> Can anyone explain this behavior? Thanks.
>
> -----Original Message-----
> From: Morus Walter [mailto:[EMAIL PROTECTED]
> Sent: Monday, February 28, 2005 2:40 AM
> To: Lucene Users List
> Subject: Re: help with boolean expression
>
>
> Omar Didi writes:
>> I have a problem understanding how Lucene would interpret this boolean
>> expression: A AND B OR C.
>> It returns neither the same count as when I enter (A AND B) OR C nor
>> the same count as A AND (B OR C).
>> If anyone knows how it is interpreted I would be thankful.
>> Thanks
>
> A AND B OR C creates a query that requires A and B. C influences the
> score, but is neither sufficient nor required for a match.
>
> IMO query parser is broken for queries mixing AND and OR without
> explicit braces.
> My favorite sample is `a AND b OR c AND d' which equals
> `a AND b AND c AND d' in query parser.
>
> I suggested a patch some time ago, but it's still pending in Bugzilla.
> http://issues.apache.org/bugzilla/show_bug.cgi?id=25820
>
> Don't know if it's still usable with current sources.
>
> Morus
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
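
For anyone following the thread, here is a rough sketch of the flag assignment Erik describes above: each operator sets the flag on the clause before it and the clause after it, and a later operator overwrites what an earlier one set. This is only a toy model written in Python for illustration, not Lucene's actual QueryParser code. The names assign_flags and render are made up for this example, and the model ignores parentheses, NOT, and the +/- modifiers.

```python
def assign_flags(query, default_operator="AND"):
    """Toy model (hypothetical, not Lucene code) of the behavior described
    in this thread: the most recent operator sets the flag on both the
    clause before it and the clause after it, with no precedence."""
    clauses = []    # list of [term, flag]; flag is "required" or "optional"
    pending = None  # explicit operator seen since the last term, if any
    for token in query.split():
        if token in ("AND", "OR"):
            pending = token
            continue
        # Without an explicit operator, the default operator sets the flag.
        operator = pending if pending is not None else default_operator
        flag = "required" if operator == "AND" else "optional"
        # An explicit operator also overwrites the previous clause's flag.
        if pending is not None and clauses:
            clauses[-1][1] = flag
        clauses.append([token, flag])
        pending = None
    return clauses


def render(clauses):
    """Show required clauses with a leading '+', optional ones bare."""
    return " ".join(
        ("+" if flag == "required" else "") + term for term, flag in clauses
    )


if __name__ == "__main__":
    for q in ("A AND B OR C", "A OR B AND C", "A B AND C"):
        print(q, "->", render(assign_flags(q, default_operator="AND")))
```

Under this model, with the default operator set to AND as in Omar's application, A AND B OR C leaves only A required and A OR B AND C leaves only B and C required, which lines up with the results Omar reported. Adding explicit parentheses before the query reaches the parser, as Omar plans, avoids depending on this flag overwriting at all.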