You are probably looking for ICU Folding, which is part of the ICU plugin (https://github.com/elasticsearch/elasticsearch-analysis-icu). It is not explained in detail on that page, but you can see a long list of the normalizations it applies in the Lucene Javadoc: http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/icu/ICUFoldingFilter.html
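As a rough sketch of how this could be wired up once the plugin is installed, the Python snippet below builds index settings for a custom analyzer that uses the plugin's `icu_folding` token filter. The analyzer name `folded_text` and the choice of the `whitespace` tokenizer are my own assumptions for illustration, not something from the plugin page. It is worth running your own sample text (including the £ sign) through the analyze API to confirm the output is what you expect.

```python
import json

# Hypothetical analyzer name, chosen only for this example.
ANALYZER_NAME = "folded_text"


def build_index_settings():
    """Build index settings that use the ICU plugin's icu_folding filter."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    ANALYZER_NAME: {
                        "type": "custom",
                        # Assumption: a whitespace tokenizer, so that symbols
                        # are not discarded before the filter sees the token.
                        "tokenizer": "whitespace",
                        # Token filter provided by the ICU analysis plugin.
                        "filter": ["icu_folding"],
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body to send when creating the index.
    print(json.dumps(build_index_settings(), indent=2))
```

The printed JSON is meant to be used as the request body when creating an index; the field mapping would then reference the analyzer by name.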
Overall, the wording of the explanation is a little hairy and you may need to chase through the Unicode pages, but it should be the production-ready approach in the end.

Regards,
Alex.

On 13 October 2014 15:30, Lee Gee <[email protected]> wrote:
> I know the asciifolding filter docs are really very clear on this, but it
> took me an embarrassingly long time to realise I was losing my currency
> symbol (£) to the ASCII folding filter.
>
> Other than creating my own character map with the char map filter, does
> there exist something of production quality that would translate accented
> UTF8 characters of the Latin-alphabet into non-accented characters in the
> ASCII range?
>
> TIA
> Lee
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/ff95c6ec-7907-454e-bd58-774ee173f4e3%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
