A more detailed explanation of the issue was posted about a year ago at http://www.nabble.com/Possible-bug-in-SpanNearQuery-td10345758.html, but I couldn't find any sign of a resolution.
As a brief summary, consider a field with these terms:

    "two one one two"

An ordered SpanNearQuery, spanNear([text:two, text:one], 1, true), yields one span:

    two one [0,2]

An unordered SpanNearQuery, spanNear([text:two, text:one], 1, false), yields three spans:

    two one [0,2]
    one one two [1,4]
    one two [2,4]

Neither query includes the span "two one one" [0,3].

This manifests itself as a problem in my work when I want to define an inverted proximity operation. Say I want to find all instances of the word "one" that don't follow the word "two" within some slop value. My initial thought was that this query would work:

    spanNot(text:one, spanNear([text:two, text:one], 1, true))

With the example string, I would have expected 0 spans returned. However, that query returns a span, "one" [2,3].

I understand now why this happens. Because SpanNearQuery does not match all possible spans, the SpanNotQuery operator cannot provide a logically inverted set of all possible spans. Any compound SpanQuery that depends on that inverted set being complete will be glaringly inaccurate.

I've looked at the code enough to know that I would have to look at it a lot longer in order to fully understand the algorithm. Is there any general interest in modifying NearSpanOrdered/NearSpanUnordered to include all possible spans?

Thanks,
Nathan

--
View this message in context: http://www.nabble.com/SpanNearQuery%3A-All-matches-within-slop-tp19191359p19191359.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
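
P.S. To make "all possible spans" concrete, here is a rough sketch of what I would expect for the example above. It is plain Java and does not use Lucene at all: AllSpansSketch, orderedNear and termNot are hypothetical names of my own, the enumeration is brute force, and I am assuming slop means the number of positions between the two terms. It is not how NearSpanOrdered works, only an illustration of the expected result set.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: brute-force span enumeration.
class AllSpansSketch {

    // Every ordered match of `first` followed by `second` within `slop`.
    // Spans are {start, end} with an exclusive end, as in the examples above.
    static List<int[]> orderedNear(String[] tokens, String first, String second, int slop) {
        List<int[]> spans = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].equals(first)) continue;
            for (int j = i + 1; j < tokens.length; j++) {
                if (!tokens[j].equals(second)) continue;
                // Assumption: slop is the number of positions between the terms.
                int gap = j - i - 1;
                if (gap <= slop) spans.add(new int[] {i, j + 1});
            }
        }
        return spans;
    }

    // Single-term spans of `term` that overlap none of the `excluded` spans.
    static List<int[]> termNot(String[] tokens, String term, List<int[]> excluded) {
        List<int[]> kept = new ArrayList<>();
        for (int p = 0; p < tokens.length; p++) {
            if (!tokens[p].equals(term)) continue;
            boolean overlaps = false;
            for (int[] ex : excluded) {
                // The term span is [p, p+1); check interval intersection.
                if (p < ex[1] && p + 1 > ex[0]) { overlaps = true; break; }
            }
            if (!overlaps) kept.add(new int[] {p, p + 1});
        }
        return kept;
    }

    public static void main(String[] args) {
        String[] tokens = {"two", "one", "one", "two"};
        List<int[]> near = orderedNear(tokens, "two", "one", 1);
        for (int[] s : near) System.out.println("near [" + s[0] + "," + s[1] + "]");
        System.out.println("spanNot matches: " + termNot(tokens, "one", near).size());
    }
}
```

With the complete set of ordered spans, [0,2] and [0,3], the inverted query leaves no "one" spans, which is the result I expected. If only [0,2] is excluded, "one" [2,3] survives, matching what I currently see.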