On Wednesday, 14 August 2013 at 03:16:08 UTC, Jeremy DeHaan wrote:
On Wednesday, 14 August 2013 at 02:53:43 UTC, jicman wrote:

Greetings.

import std.stdio;

void main()
{
 char[] str = "不良反應事件和產品客訴報告"; // 13 chinese characters...
 writefln(str.length);
}

This program prints 39. I expected it to print 13. How do I find the actual number of characters that I have in a char[] variable? Thanks.

josé

What version of DMD are you using? This code doesn't even compile for me. It gives me errors about not being able to convert type string to char[], like it should since a string literal is immutable data. To test the code I changed char[] to string. I also got an error for "writefln(str.length);" so I just changed that to "writeln(str.length);"

Anyway, from what I understand, you get this because each of those characters needs more than a single 8-bit code unit. D's chars are UTF-8 code units, so it takes more than one char to store the data needed to represent one of the Chinese characters (three chars each, in this case). str.length gives you the length of the string in chars, not in characters. You have 13 characters in your string, but you need 39 chars to store the data to represent them.
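To make the 39 versus 13 difference concrete, here is a minimal sketch in D2 (not D1) that counts characters by skipping UTF-8 continuation bytes, i.e. bytes of the form 10xxxxxx. The helper name countCodePoints is made up for this example and is not a Phobos function.

```d
import std.stdio;

// Hypothetical helper: counts code points in a UTF-8 string by counting
// every code unit that is NOT a continuation byte (10xxxxxx).
// Assumes the input is valid UTF-8.
size_t countCodePoints(const(char)[] s)
{
    size_t n = 0;
    foreach (char c; s)          // iterate over raw UTF-8 code units
    {
        if ((c & 0xC0) != 0x80)  // lead byte or ASCII byte starts a new code point
            ++n;
    }
    return n;
}

void main()
{
    string str = "不良反應事件和產品客訴報告";
    writeln(str.length);           // number of UTF-8 code units (chars)
    writeln(countCodePoints(str)); // number of code points
}
```

With the string from the original post this should print 39 and then 13, since each of these characters takes three bytes in UTF-8.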

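Alternatively, you can use a different encoding to see the actual number of characters in your string, e.g. wstring or dstring. A dstring (UTF-32) has exactly one element per code point. A wstring (UTF-16) also does for these particular characters, but code points outside the Basic Multilingual Plane take two wchars. I usually use dstrings when working with Unicode personally.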
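A rough sketch of the dstring route, again in D2 and using only std.conv.to and std.range.walkLength from Phobos. The function name lengthInCodePoints is made up for the example. I have not tried this on D1; there, std.utf.toUTF32 or a foreach over dchar should play a similar role, but treat that as an assumption.

```d
import std.conv : to;
import std.range : walkLength;
import std.stdio;

// Hypothetical helper: converts to UTF-32 and returns the element count,
// which equals the number of code points.
size_t lengthInCodePoints(string s)
{
    dstring d = s.to!dstring;
    return d.length;
}

void main()
{
    string s = "不良反應事件和產品客訴報告";
    writeln(lengthInCodePoints(s)); // code points via dstring
    writeln(s.walkLength);          // same count, decoding on the fly without a copy

    // UTF-16 is not always one unit per character:
    // U+1D11E lies outside the BMP and needs a surrogate pair.
    wstring w = "\U0001D11E"w;
    dstring d = "\U0001D11E"d;
    writeln(w.length); // 2 wchars
    writeln(d.length); // 1 dchar
}
```

Note that both approaches count code points, not user-perceived characters, so text with combining marks can still report a larger number than what appears on screen.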

This is D1. Forgot to mention that. I am still in the old ages. :-) Thanks for the insight. I figured that much, but I now need to go and figure out how to handle the Western character set as well as Asian, Hebrew, etc. Thanks.
