Doug Cutting wrote:
Why not store them in the same field using positionIncrement=0 for the
types? Then they won't change positions of non-type tokens. You
should distinguish the types syntactically, e.g., prefix them with a
space or other character that does not occur within words. That way
queries on this field for the term "name" won't match a type token.
Doug
Hi Doug, thanks for your reply,
I actually mentioned your option in my email:
I would prefer not to mix the full text and "types" in the same field,
as it would make the term positions inconsistent, which I depend on for
other queries.
In principle I could store the full text in two fields with the second
field containing the types without incrementing the token index.
Then, do a SpanQuery for "Johnson" and "name" with a distance of 0.
The resulting match would have a token position which would refer back
to the matching position in the first field. I don't know if this is
a really good idea.
i.e., Field_B = the full text interlaced with "types", each type token
following its full-text token with positionIncrement=0
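
To make the Field_B layout concrete, here is a minimal plain-Java sketch of how the interlaced tokens would resolve to positions. It does not call Lucene at all: the Token class, the interlace and positions helpers, and the sample words are hypothetical names for illustration, and it assumes the position counter starts one below the first token so that the first token lands at position 0.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of position assignment; nothing here is Lucene API.
class PositionIncrementSketch {

    // Hypothetical token: its text plus the increment relative to the previous token.
    static final class Token {
        final String text;
        final int positionIncrement;

        Token(String text, int positionIncrement) {
            this.text = text;
            this.positionIncrement = positionIncrement;
        }
    }

    // Interlace each word with its type token (if any).
    // Words advance the position by 1; type tokens use increment 0.
    static List<Token> interlace(String[] words, String[] types) {
        List<Token> out = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            out.add(new Token(words[i], 1));
            if (types[i] != null) {
                // "__type__" marks the token syntactically as a type.
                out.add(new Token("__" + types[i] + "__", 0));
            }
        }
        return out;
    }

    // Resolve increments to absolute positions (assumption: counter starts at -1).
    static int[] positions(List<Token> tokens) {
        int[] pos = new int[tokens.size()];
        int current = -1;
        for (int i = 0; i < tokens.size(); i++) {
            current += tokens.get(i).positionIncrement;
            pos[i] = current;
        }
        return pos;
    }
}
```

In this model the type tokens never shift the positions of the full-text tokens, so a position found in Field_B refers to the same token in the first field.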
However, as far as I understand, the standard TermQuery won't let me
check if "Johnson" and "__name__" occur at the **same** position.
Perhaps, as I ask above, a SpanQuery will allow multiple terms at a
distance of zero (0), that is, terms that were indexed with
positionIncrement=0. Can SpanQuery handle such zero-distance terms?
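
For clarity, this is the condition I am trying to express, written as a plain-Java sketch over an in-memory positional index. It says nothing about whether SpanQuery actually supports it; the class, the index and samePositions helpers, and the sample tokens are hypothetical, and no Lucene calls are used.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Plain-Java model of "two terms at the same position"; nothing here is Lucene API.
class SamePositionSketch {

    // Build term -> positions from a token stream with position increments.
    // Assumption: the counter starts at -1 so the first token lands at position 0.
    static Map<String, TreeSet<Integer>> index(String[] terms, int[] increments) {
        Map<String, TreeSet<Integer>> postings = new HashMap<>();
        int position = -1;
        for (int i = 0; i < terms.length; i++) {
            position += increments[i];
            postings.computeIfAbsent(terms[i], k -> new TreeSet<>()).add(position);
        }
        return postings;
    }

    // Positions where both terms occur, i.e. a distance of exactly zero.
    static Set<Integer> samePositions(Map<String, TreeSet<Integer>> postings,
                                      String a, String b) {
        TreeSet<Integer> empty = new TreeSet<>();
        TreeSet<Integer> result = new TreeSet<>(postings.getOrDefault(a, empty));
        result.retainAll(postings.getOrDefault(b, empty));
        return result;
    }
}
```

With "Johnson" indexed twice but typed as "__name__" only once, samePositions should return only the typed occurrence, which is the match I would want the query to produce.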
---Marc