> Does anyone know how to deal with these 2 issues when using
> NGramFilterFactory for autocomplete?
>
> 1) hyphens - if user types "ema" or "e-ma" I want to
> suggest "email"
>
> 2) accents - if user types "herme" want to suggest
> "Hermès"
Accents can be removed by using MappingCharFilterFactory before the
tokenizer (at both index and query time):
<charFilter class="solr.MappingCharFilterFactory"
mapping="mapping-ISOLatin1Accent.txt"/>
I am not sure if this is the most elegant solution, but you can also replace
"-" with "" using MappingCharFilterFactory. That satisfies what you describe
in 1.
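As a rough sketch of the hyphen mapping (the file name mapping-hyphen.txt is
made up for illustration; the rule could probably just as well be added to the
accent mapping file), the config might look something like this:
<!-- mapping-hyphen.txt (hypothetical file) would contain the single line:
     "-" => ""
-->
<charFilter class="solr.MappingCharFilterFactory"
mapping="mapping-hyphen.txt"/>
With that in place "e-ma" should reach the tokenizer as "ema".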
But in general NGramFilterFactory produces a lot of tokens, because it emits
grams starting at every position in the word. For example, the query "er" can
return "hermes". EdgeNGramFilterFactory may be more suitable for an
auto-complete task. At least it guarantees that some word starts with the
typed character sequence.
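Putting it together, a field type along these lines might work. This is only
a sketch, not tested against your schema: the name text_autocomplete, the
gram sizes, the choice of tokenizer and the file mapping-hyphen.txt (one line,
mapping "-" to "") are my assumptions. The edge n-grams are produced at index
time only, so the query side just normalizes what the user typed.
<fieldType name="text_autocomplete" class="solr.TextField">
  <analyzer type="index">
    <!-- char filters run before the tokenizer: strip accents, drop hyphens -->
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-hyphen.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- "hermes" is indexed as h, he, her, herm, herme, hermes -->
    <filter class="solr.EdgeNGramFilterFactory"
            minGramSize="1" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <!-- same normalization as at index time, but no n-gram filter -->
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-hyphen.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
With something like this, "herme" should match the "herme" gram indexed for
"Hermès", and "e-ma" should become "ema" and match the gram indexed for
"email". Note that a query longer than maxGramSize would not match anything,
so pick that value with your longest expected prefix in mind.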