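Hello Michael,

For the ü versus ue case, take a look at the German normalization filter [1]. It folds ü and ue (likewise ä/ae and ö/oe) to the same base letter and replaces ß with ss, so both spellings end up as the same term.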
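
To make the three spellings from your question concrete, here is a minimal plain-Java sketch. It is not a Lucene filter: `UmlautForms` and its methods are hypothetical names made up for illustration, and the only library call is `java.text.Normalizer` from the JDK. It lists the original term, the expanded form (ü -> ue) and the stripped form (ü -> u).

```java
import java.text.Normalizer;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper (not part of Lucene): spellings a German term may be searched by. */
public class UmlautForms {

    /** Expands German umlauts and sharp s into their conventional two-letter ASCII spellings. */
    public static String expand(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\u00e4': out.append("ae"); break; // a-umlaut
                case '\u00f6': out.append("oe"); break; // o-umlaut
                case '\u00fc': out.append("ue"); break; // u-umlaut
                case '\u00c4': out.append("Ae"); break; // A-umlaut
                case '\u00d6': out.append("Oe"); break; // O-umlaut
                case '\u00dc': out.append("Ue"); break; // U-umlaut
                case '\u00df': out.append("ss"); break; // sharp s
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Strips diacritics: decompose to NFD, then drop the combining marks. */
    public static String strip(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    /** Returns the original, expanded and stripped forms, without duplicates, in that order. */
    public static Set<String> variants(String term) {
        // Compose first so that "u" + combining diaeresis is handled like the single umlaut character.
        String composed = Normalizer.normalize(term, Normalizer.Form.NFC);
        Set<String> forms = new LinkedHashSet<>();
        forms.add(composed);
        forms.add(expand(composed));
        forms.add(strip(composed));
        return forms;
    }

    public static void main(String[] args) {
        // "f\u00fcr" is the word from the question, written with an escape to avoid source-encoding issues.
        System.out.println(variants("f\u00fcr"));
    }
}
```

For "für" this returns the three forms für, fuer and fur. In a real index you would more likely fold everything to one canonical form at both index and query time, as the normalizer in [1] does, rather than emit every variant.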

Regards,
Markus

[1] https://lucene.apache.org/core/7_6_0/analyzers-common/org/apache/lucene/analysis/de/GermanNormalizationFilter.html

-----Original message-----
> From: Ralf Heyde <ralf.he...@gmx.de>
> Sent: Tuesday 16th April 2019 20:28
> To: java-user@lucene.apache.org
> Subject: Re: umlauts / diacritic expansion
>
> Hey,
>
> Take a look at ASCIIFoldingFilter - this one is quite generic.
>
> Does this answer your question?
>
> Cheers
> Ralf
>
> Sent from my iPhone
>
> > On 16.04.2019 at 20:08, Michael Sokolov <msoko...@gmail.com> wrote:
> >
> > I'm learning how to index/search German today and understanding that
> > vowels with umlauts are conventionally expanded into two ASCII
> > characters, e.g. "für" -> "fuer", so people may search for the expanded
> > form "fuer", but they might also search with the diacritic, and
> > finally they might lazily search using the stripped form "fur".
> >
> > My question: is there a standard CharFilter or TokenFilter that
> > expands to both (ASCII) forms, for characters with umlauts and perhaps
> > other diacritics I might be unaware of in other languages having
> > similar multiple renderings in ASCII?
> >
> > -Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org