On Thursday 05 January 2006 19:33, Marc Hadfield wrote:
>
> Thanks Erik, Hoss -
>
> I will try MultiPhraseQuery and report back.
>
> Back in an email thread with Doug, he mentioned SpanQuery would work, and
> in a fashion it does, but I can't differentiate between terms at the
> same position and terms at contiguous positions.  The problem gets worse
> if I want to test for more than two terms at the same position (moon /
> noun / end_of_sentence / ...), which opens it up to wider contiguous
> spaces.
>
> Would it make sense to extend SpanNearQuery for this case, perhaps with
> a slop of a special value (-1) to indicate terms should be found at the
> same position?  I'm not sure how difficult this would be to represent in
> the underlying queues keeping track of matches.

Fwiw, there is an alternative implementation of ordered NearSpans here:
http://issues.apache.org/jira/browse/LUCENE-413
The NearSpansOrdered there might actually work as needed for this case.
Spans are normally slower than Phrases, so I'd try the MultiPhraseQuery first.

Regards,
Paul Elschot

>
> ---Marc
>
> Erik Hatcher wrote:
>
> > Marc,
> >
> > SpanNearQuery isn't capable of performing the proximity to within
> > only a single position in the manner you've described.  A slop of 0
> > means the terms must be contiguous with no gaps, which also allows
> > for matches in the same position as in your first example.
> >
> > I think MultiPhraseQuery (from svn trunk) will do what you want
> > though.  Please try that and report back on how it works for you.
> >
> >     Erik
> >
> >
> >: i have a problem with a SpanNearQuery returning incorrect (false
> >: positive) results.
> >
> >I'm not familiar with how Span queries are implemented, but there don't
> >appear to be any test cases dealing with an index where term position
> >increments are ever 0, so i can neither confirm nor deny the bug you're
> >seeing.
> >
> >Can you post code demonstrating the problem?  ideally in the form of
> >a simple, self contained, JUnit test?
> >
> >-Hoss
> >
> >
> > On Jan 4, 2006, at 9:39 PM, Marc Hadfield wrote:
> >
> >> hello all -
> >>
> >> i have a problem with a SpanNearQuery returning incorrect (false
> >> positive) results.
> >>
> >> I am creating the context of a field using tokens which have
> >> position increment set to either 1 or 0.  The position increment is
> >> set to 0 for special tokens, in this case part-of-speech markers.
> >> Thus, with brackets grouping the tokens that share a position:
> >>
> >> [The __pos_dt] [cow __pos_noun] [jumped __pos_verb]
> >> [over __pos_prep] [the __pos_dt] [moon __pos_noun] [. __pos_.]
> >>
> >> My Span Query looks like:
> >>
> >>     SpanQuery sq = new SpanNearQuery(new SpanQuery[] {
> >>         new SpanTermQuery(new Term("content", "jumped")),
> >>         new SpanTermQuery(new Term("content", "__pos_verb"))
> >>     }, 0, false);
> >>
> >> This correctly finds the span: [jumped __pos_verb]
> >>
> >> However, if I query:
> >>
> >>     SpanQuery sq = new SpanNearQuery(new SpanQuery[] {
> >>         new SpanTermQuery(new Term("content", "jumped")),
> >>         new SpanTermQuery(new Term("content", "__pos_noun"))
> >>     }, 0, false);
> >>
> >> This incorrectly finds the span: [cow __pos_noun] [jumped __pos_verb]
> >>
> >> This is wrong because there is a distance of 1 between the tokens,
> >> not 0.
> >>
> >> I am using a recent version of Lucene from SVN.
> >>
> >> I am thinking that the problem is related to the position increment
> >> being set to 0 for the first token of the incorrect "match" -- thus
> >> perhaps this is a bug in the SpanNearQuery?
> >>
> >> Best,
> >> Marc
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [EMAIL PROTECTED]
> >> For additional commands, e-mail: [EMAIL PROTECTED]
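
To make the distinction discussed in this thread concrete, here is a small
sketch in plain Java. It is a toy model only: it uses no Lucene classes, and
the class SamePositionSketch and its methods are hypothetical names invented
for illustration. It turns the position increments from Marc's example into
absolute positions, then compares an assumed unordered-near rule (span width
minus the combined term lengths must not exceed the slop) with a strict
same-position test. The rule is an assumption made for this sketch, chosen to
be consistent with Erik's description of slop 0; it is not taken from the
Lucene source.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the thread's example. NOT Lucene code; all names are hypothetical.
class SamePositionSketch {

    // Tokens and position increments from Marc's example
    // (an increment of 0 puts a token at the same position as the previous one).
    static final String[] TEXTS = {
        "The", "__pos_dt", "cow", "__pos_noun", "jumped", "__pos_verb",
        "over", "__pos_prep", "the", "__pos_dt", "moon", "__pos_noun",
        ".", "__pos_."
    };
    static final int[] INCREMENTS = {1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0};

    // Convert position increments into absolute positions, keyed by term text.
    static Map<String, List<Integer>> positions(String[] texts, int[] increments) {
        Map<String, List<Integer>> byTerm = new HashMap<>();
        int position = -1; // a first increment of 1 then yields position 0
        for (int i = 0; i < texts.length; i++) {
            position += increments[i];
            byTerm.computeIfAbsent(texts[i], k -> new ArrayList<>()).add(position);
        }
        return byTerm;
    }

    // Assumed unordered-near rule for two single-term spans, each of length 1:
    // (span end - span start) - (sum of term lengths) <= slop,
    // which reduces to |pa - pb| - 1 <= slop.
    static boolean nearUnordered(Map<String, List<Integer>> byTerm,
                                 String a, String b, int slop) {
        for (int pa : byTerm.getOrDefault(a, Collections.<Integer>emptyList())) {
            for (int pb : byTerm.getOrDefault(b, Collections.<Integer>emptyList())) {
                if (Math.abs(pa - pb) - 1 <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    // The stricter test Marc is asking for: both terms occur at one position.
    static boolean samePosition(Map<String, List<Integer>> byTerm, String a, String b) {
        for (int pa : byTerm.getOrDefault(a, Collections.<Integer>emptyList())) {
            if (byTerm.getOrDefault(b, Collections.<Integer>emptyList()).contains(pa)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> byTerm = positions(TEXTS, INCREMENTS);
        System.out.println(nearUnordered(byTerm, "jumped", "__pos_verb", 0)); // true
        System.out.println(nearUnordered(byTerm, "jumped", "__pos_noun", 0)); // true
        System.out.println(samePosition(byTerm, "jumped", "__pos_verb"));     // true
        System.out.println(samePosition(byTerm, "jumped", "__pos_noun"));     // false
    }
}
```

Under this model, slop 0 cannot separate "same position" from "adjacent
position", which matches the false positive Marc reports. In the toy rule a
slop of -1 would isolate same-position pairs of two terms, which is the
special value Marc proposes; whether SpanNearQuery accepts or honors a
negative slop is not established anywhere in this thread.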