[ http://issues.apache.org/jira/browse/LUCENE-569?page=comments#action_12383404 ]
paul.elschot commented on LUCENE-569:
-------------------------------------

Hoss,

I'm afraid you've uncovered a bug in NearSpans.java for the ordered case.

The test case testNearSpansSkipToLikeNext() uses this test data:

  doc 0: w1 w2 w3 .. ..
  doc 1: w1 w3 w2 w3 ..

and an ordered SpanNearQuery with slop 1 for "w1 w2 w3" should match both
doc 0 and doc 1.

The test first does a skipTo(0) on the NearSpans, which succeeds and matches
doc 0. Then it tries skipTo(1) on the NearSpans, which should succeed but
fails. NearSpans first does skipTo(1) on the Spans for the query terms, which
puts these term spans at

  doc 1: w1 w3 w2

(as expected), but this does not match because it is not ordered.
The NearSpans then tries a next() on itself, which starts by doing next() on
the term spans for w1, in NearSpans.java near line 146:

  more = min().next(); // trigger further scanning

However, in the ordered case it should have advanced the first non-ordered
term, here w3, and so it misses the match:

  doc 1: w1 .. w2 w3 ..

I would recommend using the alternative NearSpans from LUCENE-413 mentioned
above to fix this.

The NearSpansOrdered there differs from the current version in that it does
not match overlapping subspans, but it passes all current test cases,
including TestNearSpans here. Overlaps between Spans can occur when
SpanNearQueries are nested and/or when multiple terms are indexed at the same
position. If this ordered, non-overlapping matching becomes an issue, it can
always be fixed later.

The NearSpansUnordered there is just like the current NearSpans, only
simplified, and it does match overlapping subspans.

> NearSpans skipTo bug
> --------------------
>
>          Key: LUCENE-569
>          URL: http://issues.apache.org/jira/browse/LUCENE-569
>      Project: Lucene - Java
>         Type: Bug
>   Components: Search
>     Reporter: Hoss Man
>  Attachments: TestNearSpans.java
>
> NearSpans appears to have a bug in skipTo that causes it to skip over some
> matching documents completely.
> I discovered this bug while investigating
> problems with SpanWeight.explain, but as far as I can tell the bug is not
> specific to Explanations. It seems like it could potentially result in
> incorrect matching in some situations where a SpanNearQuery is nested in
> another query such that skipTo will be used. I tried to create a high-level
> test case to exploit the bug when searching, but I could not. A test case
> exploiting the class using NearSpans and SpanScorer will follow...
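
For illustration, the ordered-case failure described in the comment can be reproduced with a small toy model. This is only a sketch and not the code in NearSpans.java: the class OrderedNearSketch, the Cursor type, and both method names are hypothetical, and slop is assumed to be the match length minus the number of terms. It works on term positions inside a single document and compares two ways of continuing after an out-of-order alignment: advancing the term at the minimum position (what the comment says the current code does) versus advancing the first out-of-order term.

```java
public class OrderedNearSketch {

    /** Hypothetical stand-in for one term's Spans inside a single document. */
    static final class Cursor {
        final int[] positions; // sorted positions of the term in the document
        int idx = 0;

        Cursor(int... positions) { this.positions = positions; }

        int pos() { return positions[idx]; }

        /** Moves to the next position; returns false when exhausted. */
        boolean next() { return ++idx < positions.length; }
    }

    /** Index of the first term that is not after its predecessor, or -1 if ordered. */
    static int firstOutOfOrder(Cursor[] c) {
        for (int i = 1; i < c.length; i++) {
            if (c[i].pos() <= c[i - 1].pos()) return i;
        }
        return -1;
    }

    /** Assumption: slop = (last position - first position + 1) - number of terms. */
    static boolean withinSlop(Cursor[] c, int slop) {
        int length = c[c.length - 1].pos() - c[0].pos() + 1;
        return length - c.length <= slop;
    }

    /** Models the behaviour described in the comment: always advance the minimum. */
    static boolean matchAdvancingMin(int slop, Cursor... c) {
        while (true) {
            if (firstOutOfOrder(c) < 0 && withinSlop(c, slop)) return true;
            Cursor min = c[0];
            for (Cursor x : c) {
                if (x.pos() < min.pos()) min = x;
            }
            if (!min.next()) return false; // min term exhausted: give up on this doc
        }
    }

    /** Models the suggested behaviour: advance the first out-of-order term. */
    static boolean matchAdvancingOutOfOrder(int slop, Cursor... c) {
        while (true) {
            int bad = firstOutOfOrder(c);
            if (bad < 0) {
                if (withinSlop(c, slop)) return true;
                if (!c[0].next()) return false; // ordered but too wide: move the start
            } else if (!c[bad].next()) {
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // doc 1: w1 w3 w2 w3  ->  w1 at {0}, w2 at {2}, w3 at {1, 3}
        System.out.println("advance min:          "
            + matchAdvancingMin(1, new Cursor(0), new Cursor(2), new Cursor(1, 3)));
        System.out.println("advance out-of-order: "
            + matchAdvancingOutOfOrder(1, new Cursor(0), new Cursor(2), new Cursor(1, 3)));
    }
}
```

In this model the first strategy exhausts w1 and reports no match for doc 1, while the second moves w3 from position 1 to position 3 and finds "w1 .. w2 w3" within slop 1. The sketch ignores skipTo across documents, nested spans, and terms indexed at the same position.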