Hi Jürgen,
 
I'm aware that mapping umlauts produces many false positives, but we have 
noticed that some of our users omit them while searching. I guess we'll 
have to make a product decision there, because we cannot cover all use cases 
anyway.

Thanks for your response!

Best,

Kresimir


On Saturday, November 29, 2014 6:41:17 PM UTC+1, Jürgen Wagner (DVT) wrote:
>
>  Hello Kresimir,
>   as a native speaker of German and a linguist, I know you usually want to 
> preserve the umlaut, but for searches you may want to relax the precision 
> of matching. So, why not do precisely this? If you have "über" or "ueber" 
> in your query, replace it by "über OR ueber". And if you want to take care 
> of those Americans who believe these two dots do not carry any meaning at 
> all (heavy grin at this point), you may add even "OR uber". Syntactically, 
> "uber" is wrong. This would only be a convenience rule for users thinking 
> they can simply omit umlaut dots or who are incapable of typing umlaut 
> characters on their keyboards.
>
> Note: when it comes to German last names, the names Ganser, Gänser and 
> Gaenser would be considered three entirely different names, although the 
> alternative spelling (e.g., in plain e-mail addresses) of Gänser could be 
> Gaenser. Mapping umlauts will get you false positives.
>
> Also be careful with the reverse. "ue", "oe" and "ae" cannot simply be 
> spelled as "ü", "ö" or "ä". In a word like "Zooeingang" (zoo entrance), the 
> composite is actually made of "Zoo" and "Eingang", so the "oe" must not be 
> interpreted as "ö".
>
> Similar issues exist with "ß" and "ss".
>
> Well, most likely these odd cases won't matter too much, so I suggest 
> starting with a simple disjunctive expansion.
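
The disjunctive expansion suggested above could be sketched roughly as follows. This is only an illustration in plain Python, not an Elasticsearch feature: the names expand_term and expand_query, the lowercasing step, and the whitespace tokenization are all assumptions made for the example. The original term is always kept, so a reverse mapping such as "oe" to "ö" in "Zooeingang" only adds an alternative and never replaces what the user typed.

```python
import re

# Forward transliteration: umlaut or eszett to its ASCII digraph.
UMLAUT_TO_DIGRAPH = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}
# Reverse guess: digraph back to umlaut (may be wrong, e.g. "Zooeingang").
DIGRAPH_TO_UMLAUT = {"ae": "ä", "oe": "ö", "ue": "ü"}
# Convenience folding for users who simply drop the dots.
UMLAUT_TO_BASE = {"ä": "a", "ö": "o", "ü": "u"}


def _replace_all(term, mapping):
    """Replace every key of `mapping` found in `term` with its value."""
    pattern = re.compile("|".join(map(re.escape, mapping)))
    return pattern.sub(lambda m: mapping[m.group(0)], term)


def expand_term(term, include_folded=True):
    """Return the lowercased term plus spelling variants, original first."""
    lowered = term.lower()  # assumption: the index is lowercased as well
    candidates = [
        lowered,
        _replace_all(lowered, UMLAUT_TO_DIGRAPH),
        _replace_all(lowered, DIGRAPH_TO_UMLAUT),
    ]
    if include_folded:
        # Add the dot-less form of every candidate ("über" -> "uber").
        candidates += [_replace_all(c, UMLAUT_TO_BASE) for c in candidates]
    variants = []
    for candidate in candidates:
        if candidate not in variants:  # drop duplicates, keep order
            variants.append(candidate)
    return variants


def expand_query(query, include_folded=True):
    """Rewrite each whitespace-separated term as a disjunction of variants."""
    parts = []
    for term in query.split():
        variants = expand_term(term, include_folded)
        if len(variants) == 1:
            parts.append(variants[0])
        else:
            parts.append("(" + " OR ".join(variants) + ")")
    return " ".join(parts)


if __name__ == "__main__":
    print(expand_query("über uns"))
    print(expand_query("Ganser"))
```

Note that a query for "Ganser" is left alone here, while "Gänser" and "Gaenser" each expand to all three spellings, which is exactly where the false positives for last names come from. Passing include_folded=False drops the dot-less variant.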
>
> Best regards,
> --Jürgen
>
>  On Tue, Nov 18, 2014 at 12:30 PM, Krešimir Slugan <[email protected]> wrote:
>
>>  Hi, 
>>
>>  To handle German in search, I have to be able to provide the same 
>> results whether the user searches for, e.g., über, uber or ueber.
>>  
>>  I would do this at index time, where I would have über in the data. But 
>> if I use just the asciifolding filter, I lose the information that this was 
>> a word with an umlaut, and I can't get the ueber token. If I use a 
>> char_filter, it is applied before analysis, and I would not be able to get uber. 
>>
>>  Is it possible to preserve the original with a char filter, or to apply 
>> it after the analysis?
>>
>> Cheers,
>>
>> Kresimir
>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/f18f94bc-58e0-4bbf-a445-b45ba4db11f3%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/elasticsearch/f18f94bc-58e0-4bbf-a445-b45ba4db11f3%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
>
> Mit freundlichen Grüßen/Kind regards/Cordialement vôtre/Atentamente/С 
> уважением
> *i.A. Jürgen Wagner*
> Head of Competence Center "Intelligence"
> & Senior Cloud Consultant 
>
> Devoteam GmbH, Industriestr. 3, 70565 Stuttgart, Germany
> Phone: +49 6151 868-8725, Fax: +49 711 13353-53, Mobile: +49 171 864 1543
> E-Mail: [email protected], URL: www.devoteam.de
> ------------------------------
> Managing Board: Jürgen Hatzipantelis (CEO)
> Address of Record: 64331 Weiterstadt, Germany; Commercial Register: 
> Amtsgericht Darmstadt HRB 6450; Tax Number: DE 172 993 071 
>
>
>  

