[ http://issues.apache.org/jira/browse/LUCENE-569?page=comments#action_12383404 ]

paul.elschot commented on LUCENE-569:
-------------------------------------

Hoss,

I'm afraid you've uncovered a bug in NearSpans.java for the ordered case.
The test case testNearSpansSkipToLikeNext() uses this test data:
doc 0: w1 w2 w3 .. ..
doc 1: w1 w3 w2 w3 ..
An ordered SpanNearQuery with slop 1 for "w1 w2 w3" should match both doc 0
and doc 1.
The test first does a skipTo(0) on the NearSpans, which successfully matches
doc 0. Then it tries skipTo(1) on the NearSpans, which should succeed but
fails. NearSpans first does skipTo(1) on the Spans for the query terms, which
puts these term spans at
doc 1: w1 w3 w2
(as expected), but this does not match because it is not ordered.
The NearSpans then tries a next() on itself, which starts by doing next() on
the term spans for w1, in NearSpans.java near line 146:
      more = min().next();                        // trigger further scanning
However, in the ordered case it should have advanced the first non-ordered
term, here w3, and so it misses the match:
doc 1: w1 .. w2 w3 ..
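
To make the difference concrete, here is a small standalone sketch of the two
advancing strategies on the doc 1 test data. It is a toy model over a list of
tokens, not the actual NearSpans code: the class OrderedNearSketch, its methods,
the "xx" filler token, and the slop formula (match width minus the number of
single-position terms) are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of ordered near matching inside one document; not Lucene code. */
public class OrderedNearSketch {

    /** Position of the next occurrence of term at or after from, or -1. */
    public static int nextOccurrence(List<String> doc, String term, int from) {
        for (int p = from; p < doc.size(); p++) {
            if (doc.get(p).equals(term)) return p;
        }
        return -1;
    }

    /**
     * Returns {start, end} (end exclusive) of the first ordered match within
     * slop, or null if none is found in this document.
     * advanceFirstOutOfOrder = true  : advance the first out-of-order term.
     * advanceFirstOutOfOrder = false : always advance the minimum position,
     *                                  loosely mimicking min().next().
     */
    public static int[] firstOrderedMatch(List<String> doc, String[] terms,
                                          int slop, boolean advanceFirstOutOfOrder) {
        int[] pos = new int[terms.length];
        // Comparable to skipTo(doc) on every term: first occurrence of each.
        for (int i = 0; i < terms.length; i++) {
            pos[i] = nextOccurrence(doc, terms[i], 0);
            if (pos[i] < 0) return null;
        }
        while (true) {
            // Find the first term that does not come after its predecessor.
            int outOfOrder = -1;
            for (int i = 1; i < terms.length; i++) {
                if (pos[i] <= pos[i - 1]) { outOfOrder = i; break; }
            }
            if (outOfOrder < 0) {
                int start = pos[0];
                int end = pos[terms.length - 1] + 1;
                // Assumed slop: match width minus one position per term.
                if ((end - start) - terms.length <= slop) {
                    return new int[] {start, end};
                }
            }
            int toAdvance;
            if (advanceFirstOutOfOrder && outOfOrder >= 0) {
                toAdvance = outOfOrder;
            } else {
                // Index of the term with the smallest position.
                toAdvance = 0;
                for (int i = 1; i < pos.length; i++) {
                    if (pos[i] < pos[toAdvance]) toAdvance = i;
                }
            }
            pos[toAdvance] = nextOccurrence(doc, terms[toAdvance], pos[toAdvance] + 1);
            if (pos[toAdvance] < 0) return null; // term exhausted in this doc
        }
    }

    public static void main(String[] args) {
        List<String> doc1 = Arrays.asList("w1", "w3", "w2", "w3", "xx");
        String[] terms = {"w1", "w2", "w3"};
        // Advancing w3 (first out-of-order term) finds w1 .. w2 w3.
        System.out.println(Arrays.toString(firstOrderedMatch(doc1, terms, 1, true)));
        // Advancing the minimum (w1) exhausts w1 and misses the match.
        System.out.println(Arrays.toString(firstOrderedMatch(doc1, terms, 1, false)));
    }
}
```

In this model the first call reports the span from position 0 to 4, and the
second reports no match, because w1 has no further occurrence in doc 1 once it
is advanced.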

I would recommend using the alternative NearSpans from LUCENE-413 mentioned
above to fix this. The NearSpansOrdered there differs from the current version
in that it does not match overlapping subspans, but it passes all current test
cases, including TestNearSpans here.
Overlaps between Spans can occur when SpanNearQueries are nested and/or when
multiple terms are indexed at the same position.
If this ordered non-overlapping matching becomes an issue, it can always be
fixed later.
The NearSpansUnordered there is just like the current NearSpans, only
simplified, and it does match overlapping subspans.


> NearSpans skipTo bug
> --------------------
>
>          Key: LUCENE-569
>          URL: http://issues.apache.org/jira/browse/LUCENE-569
>      Project: Lucene - Java
>         Type: Bug

>   Components: Search
>     Reporter: Hoss Man
>  Attachments: TestNearSpans.java
>
> NearSpans appears to have a bug in skipTo that causes it to skip over some 
> matching documents completely.  I discovered this bug while investigating 
> problems with SpanWeight.explain, but as far as I can tell the bug is not 
> specific to Explanations ... it seems like it could potentially result in 
> incorrect matching in some situations where a SpanNearQuery is nested in 
> another query such that skipTo will be used ... I tried to create a high level 
> test case to exploit the bug when searching, but I could not.  TestCase 
> exploiting the class using NearSpans and SpanScorer will follow...


