TERM1 OR NOT TERM2 does not perform as expected
-----------------------------------------------

Key: LUCENE-666
URL: http://issues.apache.org/jira/browse/LUCENE-666
Project: Lucene - Java
Issue Type: Bug
Components: QueryParser
Affects Versions: 2.0.0
Environment: Windows XP, JavaCC 4.0, JDK 1.5
Reporter: Dejan Nenov
Attachments: TestAornotB.java

test:

[junit] Testsuite: org.apache.lucene.search.TestAornotB
[junit] Tests run: 3, Failures: 1, Errors: 0, Time elapsed: 0.39 sec
[junit] ------------- Standard Output ---------------
[junit] Doc1 = A B C
[junit] Doc2 = A B C D
[junit] Doc3 = A C D
[junit] Doc4 = B C D
[junit] Doc5 = C D
[junit] -------------------------------------------------
[junit] With query "A OR NOT B" we expect to hit
[junit] all documents EXCEPT Doc4, instead we only match on Doc3.
[junit] While LUCENE currently explicitly does not support queries of
[junit] the type "find docs that do not contain TERM" - this explains
[junit] not finding Doc5, but does not justify eliminating Doc1 and Doc2
[junit] -------------------------------------------------
[junit] the fix should likely require a modification to QueryParser.jj
[junit] around the method:
[junit] protected void addClause(Vector clauses, int conj, int mods, Query q)
[junit] Query:c:a -c:b hits.length=1
[junit] Query Found:Doc[0]= A C D
[junit] 0.0 = (NON-MATCH) Failure to meet condition(s) of required/prohibited clause(s)
[junit]   0.6115718 = (MATCH) fieldWeight(c:a in 1), product of:
[junit]     1.0 = tf(termFreq(c:a)=1)
[junit]     1.2231436 = idf(docFreq=3)
[junit]     0.5 = fieldNorm(field=c, doc=1)
[junit]   0.0 = match on prohibited clause (c:b)
[junit]     0.6115718 = (MATCH) fieldWeight(c:b in 1), product of:
[junit]       1.0 = tf(termFreq(c:b)=1)
[junit]       1.2231436 = idf(docFreq=3)
[junit]       0.5 = fieldNorm(field=c, doc=1)
[junit] 0.6115718 = (MATCH) sum of:
[junit]   0.6115718 = (MATCH) fieldWeight(c:a in 2), product of:
[junit]     1.0 = tf(termFreq(c:a)=1)
[junit]     1.2231436 = idf(docFreq=3)
[junit]     0.5 = fieldNorm(field=c, doc=2)
[junit] 0.0 = (NON-MATCH) Failure to meet condition(s) of required/prohibited clause(s)
[junit]   0.0 = match on prohibited clause (c:b)
[junit]     0.6115718 = (MATCH) fieldWeight(c:b in 3), product of:
[junit]       1.0 = tf(termFreq(c:b)=1)
[junit]       1.2231436 = idf(docFreq=3)
[junit]       0.5 = fieldNorm(field=c, doc=3)
[junit] Query:c:a (-c:b) hits.length=3
[junit] Query Found:Doc[0]= A B C
[junit] Query Found:Doc[1]= A B C D
[junit] Query Found:Doc[2]= A C D
[junit] 0.3057859 = (MATCH) product of:
[junit]   0.6115718 = (MATCH) sum of:
[junit]     0.6115718 = (MATCH) fieldWeight(c:a in 1), product of:
[junit]       1.0 = tf(termFreq(c:a)=1)
[junit]       1.2231436 = idf(docFreq=3)
[junit]       0.5 = fieldNorm(field=c, doc=1)
[junit]   0.5 = coord(1/2)
[junit] 0.3057859 = (MATCH) product of:
[junit]   0.6115718 = (MATCH) sum of:
[junit]     0.6115718 = (MATCH) fieldWeight(c:a in 2), product of:
[junit]       1.0 = tf(termFreq(c:a)=1)
[junit]       1.2231436 = idf(docFreq=3)
[junit]       0.5 = fieldNorm(field=c, doc=2)
[junit]   0.5 = coord(1/2)
[junit] 0.0 = (NON-MATCH) product of:
[junit]   0.0 = (NON-MATCH) sum of:
[junit]   0.0 = coord(0/2)
[junit] ------------- ---------------- ---------------
[junit] Testcase: testFAIL(org.apache.lucene.search.TestAornotB): FAILED
[junit] resultDocs =A C D expected:<3> but was:<1>
[junit] junit.framework.AssertionFailedError: resultDocs =A C D expected:<3> but was:<1>
[junit] at org.apache.lucene.search.TestAornotB.testFAIL(TestAornotB.java:137)
[junit] Test org.apache.lucene.search.TestAornotB FAILED
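
For readers following the numbers above, here is a minimal sketch of the three interpretations in play. It is a toy model in plain Java (7 or later), not Lucene code and not the attached TestAornotB.java. The class name AOrNotBSketch and its methods are hypothetical, and the "flat" and "nested" rules are only my reading of the two parsed queries shown in the output ("c:a -c:b" and "c:a (-c:b)").

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the five test documents; hypothetical, not Lucene code. */
final class AOrNotBSketch {

    static Set<String> terms(String... t) {
        return new TreeSet<>(Arrays.asList(t));
    }

    /** The five documents listed in the test's standard output. */
    static Map<String, Set<String>> docs() {
        Map<String, Set<String>> d = new LinkedHashMap<>();
        d.put("Doc1", terms("A", "B", "C"));
        d.put("Doc2", terms("A", "B", "C", "D"));
        d.put("Doc3", terms("A", "C", "D"));
        d.put("Doc4", terms("B", "C", "D"));
        d.put("Doc5", terms("C", "D"));
        return d;
    }

    /** Pure set logic for "A OR NOT B": contains A, or does not contain B. */
    static Set<String> setLogic() {
        Set<String> hits = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : docs().entrySet()) {
            Set<String> t = e.getValue();
            if (t.contains("A") || !t.contains("B")) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    /** Flat clause list "a -b": a must match, and b vetoes the whole document. */
    static Set<String> flatClauses() {
        Set<String> hits = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : docs().entrySet()) {
            Set<String> t = e.getValue();
            if (t.contains("A") && !t.contains("B")) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    /** Nested "a (-b)": assumed here that a purely negative sub-query matches nothing. */
    static Set<String> nestedClauses() {
        Set<String> hits = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : docs().entrySet()) {
            boolean negativeOnlySubQuery = false; // contributes no hits on its own
            if (e.getValue().contains("A") || negativeOnlySubQuery) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("set logic: " + setLogic());      // [Doc1, Doc2, Doc3, Doc5]
        System.out.println("a -b:      " + flatClauses());   // [Doc3]
        System.out.println("a (-b):    " + nestedClauses()); // [Doc1, Doc2, Doc3]
    }
}
```

Under these assumptions the flat form reproduces the single hit (Doc3) reported for "c:a -c:b", and the nested form reproduces the three hits reported for "c:a (-c:b)". Only the pure set-logic version also returns Doc5.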