On 5/18/20 9:44 AM, Martin Tschierschke wrote:
Hi, I have to find a certain line in a file, with a text containing umlauts. How do you do this? The following was not working:

    foreach(i, line; file){
        if(line == "My text with ö oe, ä ae or ü"){
            writeln("found it at line", i);
        }
    }

I ended up using line.canFind("with part of the text without umlaut"). It solved the problem, but what is the right way to use umlauts (encode them) inside the program?

Using == on strings is going to compare the exact bits for equality. In Unicode, the same grapheme can be encoded in different ways. For example, ö can be a single code point, the o with a diaeresis (U+00F6). But you could also encode it with 2 code points: a standard o followed by a combining diaeresis (U+006F, U+0308).
What you need is to normalize the data for comparison: https://dlang.org/phobos/std_uni.html#normalize
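
A minimal sketch of what that could look like, assuming both strings are valid UTF-8 and that NFC is an acceptable form to compare in. The helper name sameText is made up for illustration; it is not part of Phobos:

```d
import std.stdio : writeln;
import std.uni : normalize, NFC;

/// Hypothetical helper: compares two strings after normalizing both to NFC,
/// so precomposed and decomposed spellings of the same text compare equal.
bool sameText(const(char)[] a, const(char)[] b)
{
    return normalize!NFC(a) == normalize!NFC(b);
}

void main()
{
    string precomposed = "\u00F6";  // ö as a single code point
    string decomposed = "o\u0308";  // o followed by a combining diaeresis

    writeln(precomposed == decomposed);         // plain ==, no normalization
    writeln(sameText(precomposed, decomposed)); // comparison after normalization
}
```

In the loop from the question, that would mean calling something like sameText(line, "My text with ö oe, ä ae or ü") instead of using == directly.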
For more reference: https://en.wikipedia.org/wiki/Combining_character

-Steve
