[ 
http://issues.apache.org/jira/browse/LUCENE-666?page=comments#action_12431541 ] 
            
Grant Ingersoll commented on LUCENE-666:
----------------------------------------

I am not sure about this, so others should definitely contribute, but here's 
my take:

The QueryParser (QP) is not really designed to handle this case, and there is 
some misunderstanding as to what the use of multiple boolean operators in a 
single clause means.

I don't think the QP is designed to handle two boolean operators between two 
terms.  Thus, the above query really doesn't make sense, but the QP doesn't 
really prevent it either.  In Lucene, where a purely negative clause matches 
nothing on its own, A OR NOT B is effectively equivalent to just searching 
for A.  However, the QP, from what I can tell, only looks at the last 
operator before the B term and builds its clause based on that, thus building 
the query A NOT B, which then yields the results as above.
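
To make the three readings concrete, here is a minimal sketch in plain Java 
(no Lucene classes involved) that models the five documents from the report as 
term sets and evaluates each interpretation.  The class and method names are 
made up for illustration; the matching rules are my reading of the test output 
quoted below, not code taken from the QP.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical demo class: models the report's documents as plain term sets.
class AOrNotBDemo {
    static final Map<String, Set<String>> DOCS = new LinkedHashMap<>();
    static {
        DOCS.put("Doc1", new HashSet<>(Arrays.asList("A", "B", "C")));
        DOCS.put("Doc2", new HashSet<>(Arrays.asList("A", "B", "C", "D")));
        DOCS.put("Doc3", new HashSet<>(Arrays.asList("A", "C", "D")));
        DOCS.put("Doc4", new HashSet<>(Arrays.asList("B", "C", "D")));
        DOCS.put("Doc5", new HashSet<>(Arrays.asList("C", "D")));
    }

    // Collects the names of all documents whose term set satisfies the predicate.
    static List<String> matching(Predicate<Set<String>> p) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : DOCS.entrySet()) {
            if (p.test(e.getValue())) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    // "c:a -c:b": only the last operator (NOT) is applied to B, so B is prohibited.
    static List<String> lastOperatorWins() {
        return matching(terms -> terms.contains("A") && !terms.contains("B"));
    }

    // "c:a (-c:b)": the nested, purely negative clause matches nothing,
    // so only the A clause contributes hits.
    static List<String> nestedNegativeClause() {
        return matching(terms -> terms.contains("A"));
    }

    // Plain set logic for "A OR NOT B", which is what the reporter expected.
    static List<String> setLogic() {
        return matching(terms -> terms.contains("A") || !terms.contains("B"));
    }

    public static void main(String[] args) {
        System.out.println(lastOperatorWins());     // [Doc3]
        System.out.println(nestedNegativeClause()); // [Doc1, Doc2, Doc3]
        System.out.println(setLogic());             // [Doc1, Doc2, Doc3, Doc5]
    }
}
```

The first two results line up with the hit counts in the quoted test output 
(1 and 3); the third is the "all documents EXCEPT Doc4" result the reporter 
was expecting.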

I am not sure if there is something to fix here other than the documentation.  
I suppose we could throw a ParseException if we detected 2 or more boolean 
operators between terms, but I am not that familiar with JavaCC, so I am not 
100% sure. 
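
As a rough illustration of the detection idea, the sketch below scans a 
whitespace-separated query for two boolean operators in a row.  It is plain 
Java rather than a change to the JavaCC grammar, the class and method names 
are hypothetical, and it throws IllegalArgumentException only as a stand-in 
for whatever exception the QP would actually raise.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical pre-check, not part of QueryParser: flags adjacent boolean operators.
class AdjacentOperatorCheck {
    private static final Set<String> OPERATORS =
            new HashSet<>(Arrays.asList("AND", "OR", "NOT"));

    // Returns the index of the first token that is an operator directly
    // following another operator, or -1 if there is none.
    static int firstAdjacentOperator(String query) {
        String[] tokens = query.trim().split("\\s+");
        for (int i = 1; i < tokens.length; i++) {
            if (OPERATORS.contains(tokens[i]) && OPERATORS.contains(tokens[i - 1])) {
                return i;
            }
        }
        return -1;
    }

    // Rejects the query if two operators appear back to back.
    static void check(String query) {
        int index = firstAdjacentOperator(query);
        if (index >= 0) {
            throw new IllegalArgumentException(
                    "Two boolean operators in a row at token " + index + " in: " + query);
        }
    }

    public static void main(String[] args) {
        check("A OR B"); // passes silently
        try {
            check("A OR NOT B");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Note that this simple version also rejects forms such as A AND NOT B, so a 
real check would have to decide which operator combinations to allow, and it 
ignores quoting, grouping and the +/- prefix syntax entirely.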

> TERM1 OR NOT TERM2 does not perform as expected
> -----------------------------------------------
>
>                 Key: LUCENE-666
>                 URL: http://issues.apache.org/jira/browse/LUCENE-666
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: QueryParser
>    Affects Versions: 2.0.0
>         Environment: Windows XP, JavaCC 4.0, JDK 1.5
>            Reporter: Dejan Nenov
>         Attachments: TestAornotB.java
>
>
> test:
>     [junit] Testsuite: org.apache.lucene.search.TestAornotB
>     [junit] Tests run: 3, Failures: 1, Errors: 0, Time elapsed: 0.39 sec
>     [junit] ------------- Standard Output ---------------
>     [junit] Doc1 = A B C
>     [junit] Doc2 = A B C D
>     [junit] Doc3 = A   C D
>     [junit] Doc4 =   B C D
>     [junit] Doc5 =     C D
>     [junit] -------------------------------------------------
>     [junit] With query "A OR NOT B" we expect to hit
>     [junit] all documents EXCEPT Doc4, instead we only match on Doc3.
>     [junit] While LUCENE currently explicitly does not support queries of
>     [junit] the type "find docs that do not contain TERM" - this explains
>     [junit] not finding Doc5, but does not justify eliminating Doc1 and Doc2
>     [junit] -------------------------------------------------
>     [junit]  the fix should likely require a modification to QueryParser.jj
>     [junit]  around the method:
>     [junit]  protected void addClause(Vector clauses, int conj, int mods, 
> Query q)
>     [junit] Query:c:a -c:b hits.length=1
>     [junit] Query Found:Doc[0]= A C D
>     [junit] 0.0 = (NON-MATCH) Failure to meet condition(s) of 
> required/prohibited clause(s)
>     [junit]   0.6115718 = (MATCH) fieldWeight(c:a in 1), product of:
>     [junit]     1.0 = tf(termFreq(c:a)=1)
>     [junit]     1.2231436 = idf(docFreq=3)
>     [junit]     0.5 = fieldNorm(field=c, doc=1)
>     [junit]   0.0 = match on prohibited clause (c:b)
>     [junit]     0.6115718 = (MATCH) fieldWeight(c:b in 1), product of:
>     [junit]       1.0 = tf(termFreq(c:b)=1)
>     [junit]       1.2231436 = idf(docFreq=3)
>     [junit]       0.5 = fieldNorm(field=c, doc=1)
>     [junit] 0.6115718 = (MATCH) sum of:
>     [junit]   0.6115718 = (MATCH) fieldWeight(c:a in 2), product of:
>     [junit]     1.0 = tf(termFreq(c:a)=1)
>     [junit]     1.2231436 = idf(docFreq=3)
>     [junit]     0.5 = fieldNorm(field=c, doc=2)
>     [junit] 0.0 = (NON-MATCH) Failure to meet condition(s) of 
> required/prohibited clause(s)
>     [junit]   0.0 = match on prohibited clause (c:b)
>     [junit]     0.6115718 = (MATCH) fieldWeight(c:b in 3), product of:
>     [junit]       1.0 = tf(termFreq(c:b)=1)
>     [junit]       1.2231436 = idf(docFreq=3)
>     [junit]       0.5 = fieldNorm(field=c, doc=3)
>     [junit] Query:c:a (-c:b) hits.length=3
>     [junit] Query Found:Doc[0]= A B C
>     [junit] Query Found:Doc[1]= A B C D
>     [junit] Query Found:Doc[2]= A C D
>     [junit] 0.3057859 = (MATCH) product of:
>     [junit]   0.6115718 = (MATCH) sum of:
>     [junit]     0.6115718 = (MATCH) fieldWeight(c:a in 1), product of:
>     [junit]       1.0 = tf(termFreq(c:a)=1)
>     [junit]       1.2231436 = idf(docFreq=3)
>     [junit]       0.5 = fieldNorm(field=c, doc=1)
>     [junit]   0.5 = coord(1/2)
>     [junit] 0.3057859 = (MATCH) product of:
>     [junit]   0.6115718 = (MATCH) sum of:
>     [junit]     0.6115718 = (MATCH) fieldWeight(c:a in 2), product of:
>     [junit]       1.0 = tf(termFreq(c:a)=1)
>     [junit]       1.2231436 = idf(docFreq=3)
>     [junit]       0.5 = fieldNorm(field=c, doc=2)
>     [junit]   0.5 = coord(1/2)
>     [junit] 0.0 = (NON-MATCH) product of:
>     [junit]   0.0 = (NON-MATCH) sum of:
>     [junit]   0.0 = coord(0/2)
>     [junit] ------------- ---------------- ---------------
>     [junit] Testcase: testFAIL(org.apache.lucene.search.TestAornotB):   FAILED
>     [junit] resultDocs =A C D expected:<3> but was:<1>
>     [junit] junit.framework.AssertionFailedError: resultDocs =A C D 
> expected:<3> but was:<1>
>     [junit]     at 
> org.apache.lucene.search.TestAornotB.testFAIL(TestAornotB.java:137)
>     [junit] Test org.apache.lucene.search.TestAornotB FAILED


        
