Desilets, Alain wrote on 2/1/12 10:15 AM:
> Thx Peter. Would this incur the same performance problem as tokenizing the
> string on a character by character basis?

A WildcardQuery is slower than a TermQuery, but all of that cost is paid at
search time, whereas tokenizing the string character by character costs you at
both index time and search time.
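
To make the difference concrete, here is a toy sketch in Python. It is not any
search library's actual API: the index contents and the function names
term_query and wildcard_query are made up for illustration. It assumes the
simplest possible model, where an exact term is one dictionary lookup and a
wildcard has to be tested against every term in the index.

```python
from fnmatch import fnmatchcase

# Toy inverted index: term -> set of doc ids (hypothetical data).
index = {
    "www.example.com": {1, 3},
    "mail.example.com": {2},
    "www.example.org": {4},
}

def term_query(index, term):
    # Exact term: a single dictionary lookup.
    return set(index.get(term, set()))

def wildcard_query(index, pattern):
    # Wildcard: the pattern is checked against every term in the index,
    # so the cost grows with the number of distinct terms.
    hits = set()
    for term, doc_ids in index.items():
        if fnmatchcase(term, pattern):
            hits |= doc_ids
    return hits
```

A real engine is usually smarter than a full scan when the pattern starts with
literal characters, but a leading wildcard such as "*.example.com" tends to be
the expensive case.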

Your use case will incur a performance hit no matter what. In my apps, I
tokenize substrings for only particular fields at index time, and do some term
expansion instead of wildcards using a custom lexicon at search time. IME, it's
about finding a balance in your architecture to best fit your actual use cases.
Accuracy vs. speed is one balance to find. The use case you described (finding
all docs with a field matching a particular hostname) could be accomplished with
no change in indexing or tokenizing, if you used the WildcardQuery; whether that
proves too slow depends on your requirements. Try it and see.
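
For comparison, the index-time approach might look roughly like the sketch
below, again in plain Python rather than any real library's API. Everything in
it is an assumption for illustration: the choice of dot-separated hostname
suffixes as the "substrings", the names hostname_suffixes, build_index,
expand_terms and search, and the lexicon being a simple dict from a query term
to extra exact terms. It is not my production code, only the general shape.

```python
from collections import defaultdict

def hostname_suffixes(hostname):
    # "www.example.com" -> ["www.example.com", "example.com", "com"]
    labels = hostname.lower().split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]

def build_index(docs):
    # Index time: store every suffix of the hostname field as its own term.
    # This makes the index bigger but keeps searches to exact lookups.
    index = defaultdict(set)
    for doc_id, hostname in docs.items():
        for token in hostname_suffixes(hostname):
            index[token].add(doc_id)
    return index

def expand_terms(query, lexicon):
    # Search time: replace a wildcard with a fixed list of exact terms
    # taken from a (hypothetical) lexicon.
    return [query] + lexicon.get(query, [])

def search(index, query, lexicon):
    hits = set()
    for term in expand_terms(query.lower(), lexicon):
        hits |= index.get(term, set())
    return hits
```

With docs = {1: "www.example.com", 2: "mail.example.com"}, a search for
"example.com" is a plain lookup that finds both docs, at the price of three
terms per hostname in the index instead of one.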

-- 
Peter Karman  .  http://peknet.com/  .  [email protected]
