On Wednesday, 14 August 2013 at 02:53:43 UTC, jicman wrote:
know the exact length of the characters that I have in a char[] variable? Thanks.

Your code looks like D1...

in D1 or D2:
import std.utf;
dstring s2 = toUTF32(str);
writeln(s2.length); // 13


in D2 you can do it a little more efficiently like this:

import std.range;
writeln(walkLength(str)); // 13



The reason it shows 39 instead of 13 is that a char[] holds UTF-8, and Chinese characters are multi-byte in UTF-8 (typically three bytes each, which matches 39 = 13 * 3). The .length property gives the number of elements in the array, and for a char[] those elements are bytes (UTF-8 code units), not characters.
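
To make the difference concrete, here is a minimal D2 sketch. The sample text and the helper name codePointCount are made up for illustration (it is not the string from your post); walkLength is the same std.range function used above.

```d
import std.range : walkLength;
import std.stdio : writeln;

// Hypothetical helper: counts code points by decoding the UTF-8 as it walks.
size_t codePointCount(string text)
{
    return text.walkLength;
}

void main()
{
    // Made-up sample, four Chinese characters, three UTF-8 bytes each.
    string sample = "你好世界";

    writeln(sample.length);         // 12 (UTF-8 code units, i.e. bytes)
    writeln(codePointCount(sample)); // 4 (code points)
}
```

Note that walkLength has to decode the whole string, so it is O(n), while .length is O(1).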

dstring uses UTF-32, which has a fixed size for each code point. A code point isn't technically quite the same thing as a character, but it's close enough that it works here.
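
If you ever do need the "what the user sees" count rather than code points, a rough D2 sketch would look like this. It assumes a Phobos recent enough to have std.uni.byGrapheme, and graphemeCount is just a name I picked for the example.

```d
import std.range : walkLength;
import std.stdio : writeln;
import std.uni : byGrapheme;

// Hypothetical helper: counts grapheme clusters (user-perceived characters).
size_t graphemeCount(string text)
{
    return text.byGrapheme.walkLength;
}

void main()
{
    // "e" followed by U+0301 COMBINING ACUTE ACCENT, displayed as one character.
    string accented = "e\u0301";

    writeln(accented.length);         // 3 (UTF-8 code units)
    writeln(accented.walkLength);     // 2 (code points)
    writeln(graphemeCount(accented)); // 1 (grapheme cluster)
}
```

For plain Chinese text like yours, code points and graphemes give the same number, so walkLength is enough.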


Bottom line though, char[] for non-English text tends to have a longer length than you expect because a lot of characters are multi-byte in UTF-8. If you use dstring, the length is the number of code points, which is usually the number you expect.
