Hi! I already asked this on the user mailing list, but maybe it's more appropriate here:
As a simple example, imagine every document in your index has the fields "language" and "country". A tuple of language+country is what I call a context. You want to search context-specifically, i.e. language+country is always part of the query (as a QueryFilter).

FuzzyTermEnum doesn't know about these contexts, so it builds a BooleanQuery of all similar terms regardless of context. For example, "hello" is "hallo" in German, only one character different. But when searching in the context english+USA I don't care about German terms, so I don't want or need "hallo" in the BooleanQuery in this case.

So I came up with the idea of using reader.termDocs() in addition to terms() in FuzzyTermEnum. With a QueryFilter (or rather its BitSet) for each context, I could determine whether a fuzzy term occurs in any document of the current context and therefore whether it makes sense to include it in the BooleanQuery.

This results (potentially) in a smaller BooleanQuery, but I wonder whether this approach will give any noticeable performance advantage (maybe reduced IO?).

Thanks for feedback
Timo
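
P.S. To make the idea concrete, here is a minimal sketch of the filtering step. It deliberately does not use Lucene classes: the map of term to document ids is a hypothetical stand-in for what reader.termDocs(term) would iterate over, the BitSet stands in for the bits of the context's QueryFilter, and the plain Levenshtein distance is only a placeholder for the similarity that FuzzyTermEnum computes. All class and method names below are made up for illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: none of these names exist in Lucene.
class ContextFuzzySketch {

    // Plain Levenshtein distance, used here as a placeholder similarity measure.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // True if at least one document containing the term belongs to the context.
    // The int[] plays the role of the postings a termDocs() iteration would yield.
    static boolean occursInContext(int[] postings, BitSet contextBits) {
        for (int doc : postings) {
            if (contextBits.get(doc)) {
                return true; // stop at the first hit, no need to read further
            }
        }
        return false;
    }

    // Keeps only those similar terms that occur in the given context.
    static List<String> fuzzyTermsInContext(Map<String, int[]> postingsByTerm,
                                            String queryTerm,
                                            int maxEdits,
                                            BitSet contextBits) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, int[]> entry : postingsByTerm.entrySet()) {
            if (editDistance(queryTerm, entry.getKey()) > maxEdits) {
                continue; // not similar enough, same as without contexts
            }
            if (occursInContext(entry.getValue(), contextBits)) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, int[]> postings = new LinkedHashMap<>();
        postings.put("hello", new int[] {0, 1}); // docs 0 and 1 are english+USA
        postings.put("hallo", new int[] {2});    // doc 2 is german+Germany
        postings.put("hullo", new int[] {1, 2});

        BitSet englishUsa = new BitSet();
        englishUsa.set(0);
        englishUsa.set(1);

        // prints [hello, hullo]
        System.out.println(fuzzyTermsInContext(postings, "hello", 1, englishUsa));
    }
}
```

Note that in this sketch every candidate term that passes the similarity check costs an extra pass over (part of) its postings, so the smaller BooleanQuery is paid for with additional reads while the terms are enumerated. Whether that is a net win is exactly what I am unsure about.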
