How to correctly deal with unicode strings?

Gary Willoughby Wed, 27 Nov 2013 06:36:46 -0800

I've just been reading this article: http://mortoray.com/2013/11/27/the-string-type-is-broken/ and wanted to test whether D behaves the way he describes, i.e. Unicode strings being 'broken' because they are just arrays.

Although I understand the difference between code units and code points, it's not entirely clear what I need to do in D to avoid the situations he describes. For example:


import std.algorithm;
import std.stdio;

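void main(string[] args)
{
    // "noël" spelled with a decomposed ë: 'e' followed by U+0308 (combining diaeresis).
    char[] x = "noe\u0308l".dup;

    assert(x.length == 6); // Actual: counts UTF-8 code units.
    // assert(x.length == 4); // Expected.

    assert(x[0 .. 3] == "noe".dup); // Actual: the slice cuts between 'e' and its diaeresis.
    // assert(x[0 .. 3] == "noë".dup); // Expected.

    x.reverse;

    assert(x == "l\u0308eon".dup); // Actual: "l̈eon", the diaeresis moves onto the 'l'.
    // assert(x == "lëon".dup); // Expected.
}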

Here I understand what is happening, but how could I improve this example to make the expected asserts true?
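
One direction that might make the expected asserts hold is to work on grapheme clusters rather than code units. The following is only a sketch, assuming a Phobos release whose std.uni provides byGrapheme, byCodePoint and normalize; the helper names graphemeCount, firstGraphemes and reverseGraphemes are made up for illustration and are not part of any library:

```d
import std.array : array;
import std.conv : text;
import std.range : retro, take, walkLength;
import std.uni : NFC, byCodePoint, byGrapheme, normalize;

// Hypothetical helper: number of user-perceived characters (grapheme clusters).
size_t graphemeCount(string s)
{
    return s.byGrapheme.walkLength;
}

// Hypothetical helper: the first n grapheme clusters, re-encoded as a string.
string firstGraphemes(string s, size_t n)
{
    return s.byGrapheme.take(n).byCodePoint.text;
}

// Hypothetical helper: reverse cluster by cluster, so a combining mark
// stays attached to its base letter.
string reverseGraphemes(string s)
{
    return s.byGrapheme.array.retro.byCodePoint.text;
}

void main()
{
    // "noël" with a decomposed ë: 'e' followed by U+0308.
    string x = "noe\u0308l";

    assert(x.length == 6);          // UTF-8 code units
    assert(x.walkLength == 5);      // code points
    assert(graphemeCount(x) == 4);  // grapheme clusters

    assert(firstGraphemes(x, 3) == "noe\u0308");
    assert(reverseGraphemes(x) == "le\u0308on");

    // Comparing against a literal that uses the precomposed ë (U+00EB)
    // additionally needs both sides in the same normalization form.
    assert(normalize!NFC(reverseGraphemes(x)) == "l\u00EBon");
}
```

This copies the data instead of modifying the char[] in place, which may or may not be acceptable depending on the use case.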
