Desilets, Alain wrote on 2/1/12 10:15 AM:
> Thx Peter. Would this incur the same performance problem as tokenizing the
> string on a character-by-character basis?

A WildcardQuery is slower than a TermQuery. That cost is all at search time, though, whereas tokenizing the string character by character costs you at both index time and search time.

Your use case will incur a performance hit no matter what. In my apps, I tokenize substrings at index time for only particular fields, and at search time I do some term expansion with a custom lexicon instead of using wildcards.

IME, it's about finding a balance in your architecture that best fits your actual use cases. Accuracy vs. speed is one balance to find.

The use case you described (finding all docs with a field matching a particular hostname) could be accomplished with no change in indexing or tokenizing if you used the WildcardQuery. Whether that proves too slow depends on your requirements. Try it and see.

-- 
Peter Karman . http://peknet.com/ . [email protected]
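
To make the trade-off above concrete, here is a toy sketch in Python. It is not the search library's API and not the setup from my apps: the in-memory index, the function names (`build_index`, `term_query`, `wildcard_query`, `host_suffixes`) and the sample hostnames are all hypothetical. It only shows where the work lands: a wildcard has to scan the term dictionary at search time, while index-time substring tokens make the index bigger but turn the search into a single term lookup.

```python
from fnmatch import fnmatchcase

# Toy inverted index: term -> set of doc ids. Everything here is hypothetical
# and only illustrates the cost model, not a real search library.

def build_index(docs, expand=None):
    """Index each whitespace token; optionally expand it into extra terms."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            terms = expand(token) if expand else [token]
            for term in terms:
                index.setdefault(term, set()).add(doc_id)
    return index

def term_query(index, term):
    """Exact term match: one dictionary lookup."""
    return set(index.get(term, ()))

def wildcard_query(index, pattern):
    """Wildcard match: every term in the dictionary is tested at search time."""
    hits = set()
    for term, doc_ids in index.items():
        if fnmatchcase(term, pattern):
            hits |= doc_ids
    return hits

def host_suffixes(token):
    """Index-time expansion: 'www.example.com' -> itself, 'example.com', 'com'."""
    parts = token.split(".")
    return [".".join(parts[i:]) for i in range(len(parts))]

docs = {
    1: "www.example.com",
    2: "mail.example.com",
    3: "www.example.org",
}

# Option 1: index unchanged, pay at search time with a wildcard scan.
plain = build_index(docs)
wildcard_hits = wildcard_query(plain, "*.example.com")

# Option 2: pay at index time (more terms), search with a plain term lookup.
expanded = build_index(docs, expand=host_suffixes)
term_hits = term_query(expanded, "example.com")
```

Both options find the same documents here. The difference is that `plain` holds one term per hostname and is scanned in full on every wildcard search, while `expanded` holds every dot-separated suffix and answers with one lookup. Which side of that balance is right depends on how large your term dictionary is and how often you search.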
