On Wednesday, 14 August 2013 at 03:16:08 UTC, Jeremy DeHaan wrote:
On Wednesday, 14 August 2013 at 02:53:43 UTC, jicman wrote:
Greetings.
import std.stdio;
void main()
{
char[] str = "不良反應事件和產品客訴報告"; // 13 chinese characters...
writefln(str.length);
}
This program prints 39. I expected it to print 13. How do I
find the exact number of characters that I have in a
char[] variable? Thanks.
josé
What version of DMD are you using? This code doesn't even
compile for me. It gives me errors about not being able to
convert type string to char[], like it should since a string
literal is immutable data. To test the code I changed char[] to
string. I also got an error for "writefln(str.length);" so I
just changed that to "writeln(str.length);"
Anyways, from what I understand, the reason you get this is
because each of those characters needs more than a single 8-bit
code unit. D's chars are UTF-8 code units, so it takes more than
one char to store the data needed to represent one of the
Chinese characters (three chars each, in this case). str.length
gives you the length of the string in chars, not in characters.
You have 13 characters in your string, but you need 39 chars to
store the data to represent them.
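To make the difference concrete, here is a minimal sketch, assuming D2 and Phobos (it will not build as-is on D1). std.utf.count returns the number of code points in a string; codePointCount is just a hypothetical wrapper name I made up for illustration.

```d
import std.stdio : writeln;
import std.utf : count;

// Hypothetical helper: number of Unicode code points in a UTF-8 string.
size_t codePointCount(string s)
{
    return count(s); // std.utf.count decodes the UTF-8 and counts code points
}

void main()
{
    string str = "不良反應事件和產品客訴報告";
    writeln(str.length);          // 39: UTF-8 code units (chars)
    writeln(codePointCount(str)); // 13: code points
}
```

Note that this counts code points, which matches what you would call a character here, but can differ from it when combining marks are involved.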
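Alternatively, you can use a different encoding to see the
actual number of characters in your string, e.g. wstring or
dstring. A dstring stores one code point per dchar, so its
length is the number of code points; a wstring only matches for
characters in the Basic Multilingual Plane, since anything
outside it takes two wchars. I usually use dstrings when
working with Unicode personally.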
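A rough sketch of that approach, again assuming D2 and Phobos rather than D1. std.conv.to does the transcoding; utf16Length and utf32Length are hypothetical helper names used only for this example.

```d
import std.stdio : writeln;
import std.conv : to;

// Hypothetical helper: length of the string once transcoded to UTF-16.
size_t utf16Length(string s)
{
    return s.to!wstring.length; // counts wchars (UTF-16 code units)
}

// Hypothetical helper: length of the string once transcoded to UTF-32.
size_t utf32Length(string s)
{
    return s.to!dstring.length; // counts dchars, i.e. code points
}

void main()
{
    string str = "不良反應事件和產品客訴報告";
    writeln(str.length);       // 39 chars
    writeln(utf16Length(str)); // 13 wchars, all characters are in the BMP
    writeln(utf32Length(str)); // 13 dchars
}
```

The conversion allocates a new string, so if you only need the count and not the dstring itself, counting code points directly on the UTF-8 data is cheaper.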
This is D1. Forgot to mention that. I am still in the old ages.
:-) Thanks for the insight. I figured that much, but I need to
now go and try to figure out what to do with both the Western
character set and the Asian, Hebrew, etc. ones. Thanks.