Suppose I have an index containing the terms impostor, imposter, fraud, and fruad, then presumably regardless of whether I spell impostor and fraud correctly, Lucene SpellChecker will offer the improperly spelled versions as corrections. This means that the phrase "The login fraud involves an impostor" would need to expand to:
"The login fraud involves an impostor" OR "The login fruad involves an impostor" OR "The login fraud involves an imposter" OR "The login fruad involves an imposter" to cover all cases and thus find all possible matches. However, that feels like an aweful a lot of matches to perform on the index. A more efficient approach would be to expand the query to "The login (fraud OR fruad) involves an (impostor OR imposter)", which should be logically equivalent to the first (longer) query. So my question is (1) if others have generated the "The login (fraud OR fruad) involves an (impostor OR imposter)" types of queries when applying SpellChecker to a phrase, and agreed that this indeed performs better than the first one. (2) if others have observed any problems in doing so in terms of performance or anything else Any information would be appreciated.
