D strings have a dual nature. They behave as arrays of code units when you slice them or read .length directly (because those operations are guaranteed to be O(1)), but all algorithms in the standard library treat them as ranges of dchar, i.e. code points:

import std.algorithm;
import std.range : walkLength, take;
import std.array : array;

void main(string[] args)
{
        char[] x = "noël".dup;

        assert(x.length == 6);
        assert(x.walkLength == 5); // ë is two code points (e + combining diaeresis) on my machine

        assert(x[0 .. 3] == "noe".dup); // Actual.
        assert(array(take(x, 4)) == "noë"d);

        x.reverse;

        assert(x == "l̈eon".dup); // Actual and correct!
}

The problem you have here is that ë can be represented as two separate Unicode code points despite being drawn as a single symbol. This has nothing to do with strings being arrays of code units; using an array of `dchar` will result in the same behavior.
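
To illustrate, here is a minimal sketch, assuming the decomposed spelling (e followed by U+0308, written out with an escape so the editor cannot change it). It shows that `dchar[]` reverses the same way, and that working on grapheme clusters with `std.uni.byGrapheme` keeps the diaeresis on the e. `reverseGraphemes` is a hypothetical helper name of mine, not something from Phobos.

```d
import std.algorithm : reverse;
import std.array : array;
import std.conv : text;
import std.range : retro, walkLength;
import std.uni : byCodePoint, byGrapheme, normalize, NFC;

// Hypothetical helper: reverse by grapheme clusters instead of code points.
string reverseGraphemes(string s)
{
    return s.byGrapheme   // range of Grapheme (user-perceived characters)
            .array        // retro needs a bidirectional range
            .retro
            .byCodePoint  // flatten the clusters back into code points
            .text;        // encode as UTF-8
}

void main()
{
    // Assumption: decomposed form, 'e' followed by U+0308 COMBINING DIAERESIS.
    string s = "noe\u0308l";

    assert(s.length == 6);                // UTF-8 code units
    assert(s.walkLength == 5);            // code points
    assert(s.byGrapheme.walkLength == 4); // grapheme clusters

    // dchar[] gives the same result: the combining mark lands on the 'l'.
    dchar[] d = "noe\u0308l"d.dup;
    reverse(d);
    assert(d == "l\u0308eon"d);

    // Reversing whole clusters keeps the mark attached to the 'e'.
    assert(reverseGraphemes(s) == "le\u0308on");

    // NFC composes e + U+0308 into the single code point U+00EB.
    assert(normalize!NFC(s).walkLength == 4);
}
```

Normalizing to NFC first happens to help for this particular word, but not every combining sequence has a precomposed form, so iterating by grapheme is the more general approach.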
