Hello,

I am looking to perform queries efficiently across multiple fields whose token positions are synchronized, i.e.:

Field_A[100] has some relationship to Field_B[100]

For example, consider two fields: one holds the full text of an article, and the other holds the "type" of each token, where the type could be from { person, company, date, ... }

So that for a Document:
Field_A : "Fred Johnson worked for Johnson and Johnson in 2001"
Field_B : "name name other other company company company other date"

and we wish to perform a query:
Field_A:"Johnson" AND Field_B:"name"
which should match only at token number 2 (where the type is "name") and not at tokens 5 and 7 (where the type is "company").
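
To make the intended semantics concrete, here is a minimal sketch in plain Python (not Lucene code) of the match I am after. It assumes whitespace tokenization and 1-based token positions; the function name aligned_matches is made up for illustration.

```python
def aligned_matches(text, types, term, wanted_type):
    """Return the 1-based token positions where the text token equals
    `term` and the type token at the same position equals `wanted_type`."""
    text_tokens = text.split()
    type_tokens = types.split()
    # The two fields are only comparable if their positions line up.
    if len(text_tokens) != len(type_tokens):
        raise ValueError("fields must have the same number of tokens")
    return [
        pos
        for pos, (tok, typ) in enumerate(zip(text_tokens, type_tokens), start=1)
        if tok == term and typ == wanted_type
    ]


field_a = "Fred Johnson worked for Johnson and Johnson in 2001"
field_b = "name name other other company company company other date"

print(aligned_matches(field_a, field_b, "Johnson", "name"))     # [2]
print(aligned_matches(field_a, field_b, "Johnson", "company"))  # [5, 7]
```

This scans every token of a single document, so it only shows what a correct answer looks like; the open question is how to get the same result efficiently from the index.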

I think Span Queries could be adapted to this purpose, but I wanted to get any thoughts from the list.

I would prefer not to mix the full text and the "types" in the same field, as that would make the term positions inconsistent, and I depend on those positions for other queries.

In principle I could index the full text in two fields, with the second field also containing the types, added without incrementing the token position so that each type shares a position with its word. Then I could do a SpanQuery for "Johnson" and "name" with a distance of 0. The resulting match would have a token position that refers back to the matching position in the first field. I don't know whether this is really a good idea.
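
Roughly, the second field would behave like the toy positional index below. This is a plain Python model of the idea, not Lucene's API: the postings structure, the function names, and the "TYPE_" prefix (added so that the type "name" cannot collide with the word "name" in the text) are all assumptions of the sketch.

```python
from collections import defaultdict


def build_stacked_postings(text, types, type_prefix="TYPE_"):
    """Toy positional postings for one combined field.

    Each text token advances the position by 1. Its type is added at the
    same position, which models a position increment of 0.
    """
    text_tokens = text.split()
    type_tokens = types.split()
    if len(text_tokens) != len(type_tokens):
        raise ValueError("fields must have the same number of tokens")
    postings = defaultdict(set)
    for pos, (tok, typ) in enumerate(zip(text_tokens, type_tokens), start=1):
        postings[tok].add(pos)                # the word itself
        postings[type_prefix + typ].add(pos)  # its type, stacked on the same position
    return postings


def same_position(postings, term_a, term_b):
    """Positions where both terms occur, i.e. a distance-0 match."""
    return sorted(postings.get(term_a, set()) & postings.get(term_b, set()))


postings = build_stacked_postings(
    "Fred Johnson worked for Johnson and Johnson in 2001",
    "name name other other company company company other date",
)

print(same_position(postings, "Johnson", "TYPE_name"))     # [2]
print(same_position(postings, "Johnson", "TYPE_company"))  # [5, 7]
```

Because the words in the combined field still advance one position per token, the positions returned here are the same as the positions in the text-only field.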

Any thoughts?

---Marc Hadfield

