hello -
i am looking to perform queries efficiently across multiple fields that
have their token order synchronized, ie:
Field_A[100] has some relationship to Field_B[100]
for example, consider two fields, one the full text of an article and
the other the "type" of the token where type could be from { person,
company, date, ... }
So that for a Document:
Field_A : "Fred Johnson worked for Johnson and Johnson in 2001"
Field_B : "name name other other company company company other date"
and we wish to perform a query:
Field_A:"Johnson" AND Field_B:"name"
which would be true for token number 2 but not for 5 and 7
I think Span Queries could be adapted to this purpose, but I wanted to
get any thoughts from the list.
I would prefer not to mix the full text and "types" in the same field as
it would make the term positions inconsistent which i depend on for
other queries.
In principle I could store the full text in two fields with the second
field containing the types without incrementing the token index. Then,
do a SpanQuery for "Johnson" and "name" with a distance of 0. The
resulting match would have a token position which would refer back to
the matching position in the first field. I don't know if this is a
really good idea.
Any thoughts?
---Marc Hadfield
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]