Sorry for the spam - typo of '8' instead of 'a' - hard enough to follow without that - read this one below instead:

Mark Miller wrote:
> Mark Miller wrote:
>
>>> Yeah I think you do, except each payload is only returned once. So
>>> it's only the first span that hits a payload that will return it.
>>>
>>> So it sounds like SNQ just isn't guaranteed to be exhaustive in how it
>>> enumerates the spans, eg I'll never see that 2nd occurrence of "k",
>>> nor its associated payload.
>>>
>> Not only not guaranteed, but it's just not going to happen - it's not
>> how spans match. If I say find n within 300 of m with the following:
>>
>> n m m m m m m m m m m m m m m m m m m m m m m m m m m m m m m m m m m
>> m m m m m m m m m m m m
>>
>> Only the first m will match. It will start at the left, find the n, then
>> say great, an m within 300, this doc matches, we are done. There is
>> not another n to start on or finish on to the right. It doesn't then
>> touch the next 300 m's - just the way Doug implemented them from what I
>> can tell. It's only exhaustive from the left - find m within 300 of n,
>> order matters (m first):
>>
>> m m m m m m m m m m m m m m m m m m n
>>
>> This will be a bunch of spans - start at the left - the first m to n
>> matches, then the second m to n matches, then the third m to n matches,
>> and so on as we move right.
>>
> You can figure out what will match using the Span rules I mentioned, by
> the way (at least I believe so).
>
> Those rules are simple (this is my current working knowledge and I don't
> guarantee it - but I haven't seen it off yet):
>
> 1. Only one span can start from a term.
> 2. Start matching from the left and work right.
>
> So applying that to your example:
>
> SpanNearQuery
>   SpanTermQuery term=a
>   SpanTermQuery term=k
>
> 0:a 1:a 1:b 2:c 2:d 3:e 3:a 4:f 4:g 5:h 5:i 6:j 6:a 7:b 7:k 8:k
>
>> span 0 to 8
>> span 1 to 8
>> span 3 to 8
>> span 6 to 8
>>
> So first we see 0, which is an a - we draw our span because the k at 7
> is within 30: 0-8. We move right now, because we can't start at that
> term again. Another a at 1 - and again the k at 7 is within 30 - mark
> our span: 1-8. Now we have to move right one at least, but we don't
> find the next a till 3 - again there is a k within 30 at 7 - mark our
> span: 3-8. Now move right a term at least - we find another a at 6 -
> again there is a k within 30 at 7 - mark our span: 6-8. Now we are done.
> We never needed or used the k at 8 (ends at 9) in the Spans algorithm.
>

--
- Mark

http://www.lucidimagination.com
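
For anyone who wants to play with the two rules quoted above, here is a rough toy model in Python. It is not Lucene code and does not claim to mirror the real Spans implementation: the function name `toy_ordered_near_spans`, the `(position, term)` token list, and the gap check `k_pos - a_pos - 1 <= slop` are all assumptions made for illustration. It only encodes "one span per starting term" and "scan left to right", which seems to be enough to reproduce the spans listed in the example.

```python
from bisect import bisect_right


def toy_ordered_near_spans(tokens, first, second, slop):
    """Toy model of the two rules, NOT the Lucene implementation.

    tokens: iterable of (position, term) pairs.
    Returns (start, end) pairs with an exclusive end, so a term at
    position 7 ends at 8.
    """
    # Rule 2: work left to right over the candidate start positions.
    first_positions = sorted({p for p, t in tokens if t == first})
    second_positions = sorted({p for p, t in tokens if t == second})

    spans = []
    for start in first_positions:
        # Nearest occurrence of the second term strictly to the right.
        i = bisect_right(second_positions, start)
        if i == len(second_positions):
            break  # nothing left to finish on
        end_term = second_positions[i]
        # Assumed slop check: positions skipped between the two terms.
        if end_term - start - 1 <= slop:
            # Rule 1: at most one span starts from this term.
            spans.append((start, end_term + 1))
    return spans


# The token stream from the example, as (position, term) pairs.
tokens = [
    (0, "a"), (1, "a"), (1, "b"), (2, "c"), (2, "d"), (3, "e"),
    (3, "a"), (4, "f"), (4, "g"), (5, "h"), (5, "i"), (6, "j"),
    (6, "a"), (7, "b"), (7, "k"), (8, "k"),
]
print(toy_ordered_near_spans(tokens, "a", "k", 30))
```

Under these assumptions the k at position 8 is never used, and an "n m m m ..." stream yields a single span, while "m m m ... n" (m first) yields one span per m.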