: 1) Use a modified SpanNearQuery. If we assume that country + phone will always
: be one token, we can rely on the fact that the positions of 'au' and '5678' in
: Fred's document will be different.
:
: SpanQuery q1 = new SpanTermQuery(new Term("addresscountry", "au"));
: SpanQuery q2 = new SpanTermQuery(new Term("addressphone", "5678"));
: SpanQuery snq = new SpanNearQuery(new SpanQuery[]{q1, q2}, 0, false);
:
: the slop of 0 means that we'll only return those where the two terms are in
: the same position in their respective fields. This works brilliantly, BUT
: requires a change to SpanNearQuery's constructor (which checks that all the
: clauses are against the same field). Are people amenable to perhaps adding
: another constructor to SNQ which doesn't do the check, or subclassing it to do
: the same (give it a protected non-checking constructor for the subclass to
: call)?

this has actually come up a couple of times over the years (i think Doug
was the first person i ever heard suggest it) in the context of
PhraseQuery ... the initial thought was that just removing the
term1.field=term2.field assertion would allow something like this to work,
but i don't think anyone ever tried creating a patch w/tests to verify it.

I think it would be a great idea.

: 2) It gets slightly more complicated in the case of variable-length terms. For
	...
: getPositionIncrementGap -- if we knew that 'address' would be, at most, 20
: tokens, we might use a position increment gap of 100, and make the slop factor
: 50; this works fine for the simple case (yay!), but with a great many
: addresses-per-user starts to get more complicated, as the gap counts from the
: last term (so the position sequence for a single value field might be 0, 100,
: 200, but for the address field it might be 0, 1, 2, 3, 103, 104, 105, 106,
: 206, 207... so it's going to get out of sync). The simplest option here seems

couldn't this be solved by an Analyzer that counts the tokens per fieldname
and implements getPositionIncrementGap as..

   int result = SOME_BIG_NUM - tokensSeenMap.get(fieldname);
   tokensSeenMap.put(fieldname, 0);
   return result;

?

-Hoss
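
For what it's worth, the gap arithmetic suggested above can be checked with a small stand-alone simulation. This is only a sketch and does not touch Lucene: the class GapSketch and the method firstPositions are made up for illustration. It assumes the indexer adds the gap to a running position counter that sits one past the last token of the previous value (the exact off-by-one may differ between Lucene versions), that every value has fewer than SOME_BIG_NUM tokens, and that only one document is being indexed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of position bookkeeping for a multi-valued field.
// Not Lucene code; it only models the arithmetic of the position increment gap.
public class GapSketch {
    static final int SOME_BIG_NUM = 100;

    // Returns the position of the first token of each value of one field.
    // tokensPerValue[i] is the number of tokens in the i-th value.
    // compensate=false: fixed gap of SOME_BIG_NUM between values.
    // compensate=true:  gap of SOME_BIG_NUM minus the tokens seen in the
    //                   previous value, as in the pseudocode above.
    static List<Integer> firstPositions(int[] tokensPerValue, boolean compensate) {
        List<Integer> starts = new ArrayList<>();
        int position = 0;   // assumed running position counter for the field
        int tokensSeen = 0; // tokens counted since the last gap was requested
        for (int i = 0; i < tokensPerValue.length; i++) {
            if (i > 0) {
                // The gap is only applied between values, never before the first.
                int gap = compensate ? SOME_BIG_NUM - tokensSeen : SOME_BIG_NUM;
                position += gap;
            }
            starts.add(position);
            position += tokensPerValue[i]; // one position per token
            tokensSeen = tokensPerValue[i]; // counter was reset at the gap
        }
        return starts;
    }

    public static void main(String[] args) {
        int[] address = {4, 4, 2}; // three address values of varying length
        System.out.println(firstPositions(address, false)); // [0, 104, 208]
        System.out.println(firstPositions(address, true));  // [0, 100, 200]
    }
}
```

Under those assumptions the compensated gap keeps each value starting on a multiple of SOME_BIG_NUM, so a variable-length field stays aligned with a field that has one token per value.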