: 1) Use a modified SpanNearQuery. If we assume that country + phone will always
: be one token, we can rely on the fact that the positions of 'au' and '5678' in
: Fred's document will be different.
:
: SpanQuery q1 = new SpanTermQuery(new Term("addresscountry", "au"));
: SpanQuery q2 = new SpanTermQuery(new Term("addressphone", "5678"));
: SpanQuery snq = new SpanNearQuery(new SpanQuery[]{q1, q2}, 0, false);
:
: the slop of 0 means that we'll only return those where the two terms are in
: the same position in their respective fields. This works brilliantly, BUT
: requires a change to SpanNearQuery's constructor (which checks that all the
: clauses are against the same field). Are people amenable to perhaps adding
: another constructor to SNQ which doesn't do the check, or subclassing it to do
: the same (give it a protected non-checking constructor for the subclass to
: call)?
This has actually come up a couple of times over the years (I think Doug
was the first person I ever heard suggest it) in the context of
PhraseQuery ... the initial thought was that just removing the
term1.field=term2.field assertion would allow something like this to work,
but I don't think anyone ever tried creating a patch w/tests to verify
it.
I think it would be a great idea.
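
To make the "same position, different field" idea concrete, here is a rough
plain-Java sketch. It does not use Lucene at all: it only models each field as
a list of single tokens, one per position, which is the assumption from point
1. The class name, the method names, and the sample data for Fred and a second
user are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model, not Lucene code: a document is a map from field name
// to that field's tokens, and token i sits at position i.
final class CrossFieldPositionSketch {

    /** Positions at which the term occurs in one field's token list. */
    static List<Integer> positionsOf(List<String> fieldTokens, String term) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < fieldTokens.size(); i++) {
            if (fieldTokens.get(i).equals(term)) {
                out.add(i);
            }
        }
        return out;
    }

    /** True if termA (in fieldA) and termB (in fieldB) share a position. */
    static boolean samePosition(Map<String, List<String>> doc,
                                String fieldA, String termA,
                                String fieldB, String termB) {
        List<Integer> a = positionsOf(doc.getOrDefault(fieldA, List.of()), termA);
        List<Integer> b = positionsOf(doc.getOrDefault(fieldB, List.of()), termB);
        a.retainAll(b); // keep only the positions present in both fields
        return !a.isEmpty();
    }

    public static void main(String[] args) {
        // Assumed data: Fred has (au, 1234) and (us, 5678).
        Map<String, List<String>> fred = Map.of(
                "addresscountry", List.of("au", "us"),
                "addressphone", List.of("1234", "5678"));
        // Assumed data: a second user has (au, 5678) as one address.
        Map<String, List<String>> other = Map.of(
                "addresscountry", List.of("nz", "au"),
                "addressphone", List.of("0000", "5678"));

        System.out.println(samePosition(fred, "addresscountry", "au",
                "addressphone", "5678"));  // false: positions 0 and 1
        System.out.println(samePosition(other, "addresscountry", "au",
                "addressphone", "5678")); // true: both at position 1
    }
}
```

A slop of 0 across fields would amount to this position intersection; whether
the real span machinery behaves once the field check is removed is exactly
what a patch w/tests would need to show.
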
: 2) It gets slightly more complicated in the case of variable-length terms. For
...
: getPositionIncrementGap -- if we knew that 'address' would be, at most, 20
: tokens, we might use a position increment gap of 100, and make the slop factor
: 50; this works fine for the simple case (yay!), but with a great many
: addresses-per-user starts to get more complicated, as the gap counts from the
: last term (so the position sequence for a single value field might be 0, 100,
: 200, but for the address field it might be 0, 1, 2, 3, 103, 104, 105, 106,
: 206, 207... so it's going to get out of sync). The simplest option here seems
Couldn't this be solved by an Analyzer that counts the tokens per field name
and implements getPositionIncrementGap as ...
int result = SOME_BIG_NUM - tokensSeenMap.get(fieldname);
tokensSeenMap.put(fieldname, 0);
return result;
?
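
Here is a small plain-Java simulation of that arithmetic, as a sanity check
rather than a real Analyzer. The assumptions are mine: the indexer keeps a
"next free position" counter, adds the gap to it once between consecutive
values of the same field, and something calls tokenSeen for every token
emitted. The exact off-by-one may differ between Lucene versions. The class
and method names are hypothetical, and SOME_BIG_NUM is set to 100 purely for
the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of the counting-gap idea; not a Lucene Analyzer.
final class CountingGapSketch {
    // Assumed to exceed the token count of any single field value.
    static final int SOME_BIG_NUM = 100;

    private final Map<String, Integer> tokensSeen = new HashMap<>();

    /** Record one emitted token for the given field. */
    void tokenSeen(String fieldName) {
        tokensSeen.merge(fieldName, 1, Integer::sum);
    }

    /** Gap that pads the value just indexed out to SOME_BIG_NUM positions. */
    int positionIncrementGap(String fieldName) {
        int result = SOME_BIG_NUM - tokensSeen.getOrDefault(fieldName, 0);
        tokensSeen.put(fieldName, 0); // start counting the next value afresh
        return result;
    }

    /** Start position of each value, given the token count of each value. */
    static List<Integer> valueStarts(String fieldName, int[] tokensPerValue) {
        CountingGapSketch gaps = new CountingGapSketch();
        List<Integer> starts = new ArrayList<>();
        int position = 0; // next free position (assumed indexer behaviour)
        for (int v = 0; v < tokensPerValue.length; v++) {
            if (v > 0) {
                position += gaps.positionIncrementGap(fieldName);
            }
            starts.add(position);
            for (int t = 0; t < tokensPerValue[v]; t++) {
                gaps.tokenSeen(fieldName);
                position++;
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        // Multi-token address values of 4, 3 and 7 tokens.
        System.out.println(valueStarts("address", new int[] {4, 3, 7}));
        // prints [0, 100, 200]
        // Single-token values in another field stay in step.
        System.out.println(valueStarts("addresscountry", new int[] {1, 1, 1}));
        // prints [0, 100, 200]
    }
}
```

Under those assumptions every value starts on a multiple of SOME_BIG_NUM, so
the fields stay in sync no matter how many tokens each address has.
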
-Hoss