Doug Cutting wrote:

Why not store them in the same field using positionIncrement=0 for the types? Then they won't change positions of non-type tokens. You should distinguish the types syntactically, e.g., prefix them with a space or other character that does not occur within words. That way queries on this field for the term "name" won't match a type token.

Doug


Hi Doug, thanks for your reply,

I actually mentioned this option in my original email:

I would prefer not to mix the full text and "types" in the same field, as it would make the term positions inconsistent, which I depend on for other queries.

In principle I could store the full text in two fields with the second field containing the types without incrementing the token index. Then, do a SpanQuery for "Johnson" and "name" with a distance of 0. The resulting match would have a token position which would refer back to the matching position in the first field. I don't know if this is a really good idea.

i.e., Field_B = full text interlaced with "types", each type following its full-text token with positionIncrement=0
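
To make the Field_B layout concrete, here is a minimal sketch in plain Java of how position increments turn into absolute positions. It is only a model and not Lucene's API: the class InterlacedPositions, the Token type and the sample tokens are made up for illustration, and it assumes the position counter starts at -1 and each token adds its increment, so the first token lands at position 0.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a token stream; none of these names come from Lucene.
final class InterlacedPositions {

    /** A term plus its position increment relative to the previous token. */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Turns per-token increments into absolute positions (assumed to start at -1). */
    static int[] absolutePositions(List<Token> tokens) {
        int[] positions = new int[tokens.size()];
        int position = -1;
        for (int i = 0; i < tokens.size(); i++) {
            position += tokens.get(i).positionIncrement;
            positions[i] = position;
        }
        return positions;
    }

    /** Sample Field_B: the type token "__name__" follows "johnson" with increment 0. */
    static List<Token> sampleFieldB() {
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token("mr", 1));
        tokens.add(new Token("johnson", 1));
        tokens.add(new Token("__name__", 0)); // type token, shares the position of "johnson"
        tokens.add(new Token("called", 1));
        return tokens;
    }
}
```

In this model "johnson" and "__name__" both end up at position 1, and "called" is at position 2, the same position it would have had without the type token.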

However, as far as I understand, a standard TermQuery won't let me check whether "Johnson" and "__name__" occur at the **same** position. Perhaps, as I asked above, a SpanQuery allows multiple terms with a distance of zero (0)? That is, if the terms were indexed with positionIncrement=0, can a SpanQuery match terms that are 0 positions apart?
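
What I am after can be sketched in plain Java as an intersection of position lists. This is a toy model and not Lucene code: SamePositionCheck, index and samePositions are hypothetical names, positions are assumed to start at -1 before the first increment, and whether a SpanQuery actually behaves like this for zero-distance terms is exactly my open question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of positional postings; not Lucene's API.
final class SamePositionCheck {

    /** Builds term -> list of absolute positions from terms and their increments. */
    static Map<String, List<Integer>> index(String[] terms, int[] increments) {
        Map<String, List<Integer>> postings = new HashMap<>();
        int position = -1;
        for (int i = 0; i < terms.length; i++) {
            position += increments[i];
            postings.computeIfAbsent(terms[i], k -> new ArrayList<>()).add(position);
        }
        return postings;
    }

    /** Returns the positions at which both terms occur (a "distance 0" match). */
    static List<Integer> samePositions(Map<String, List<Integer>> postings,
                                       String a, String b) {
        List<Integer> result =
                new ArrayList<>(postings.getOrDefault(a, Collections.emptyList()));
        result.retainAll(postings.getOrDefault(b, Collections.emptyList()));
        return result;
    }
}
```

For example, indexing the terms "mr", "johnson", "__name__", "and", "name" with increments 1, 1, 0, 1, 1 and asking for samePositions(postings, "johnson", "__name__") gives [1], while "johnson" together with the ordinary word "name" gives an empty list.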





---Marc




