Re: What is the legal range of chars?

Ali Çehreli Wed, 19 Jun 2013 08:16:10 -0700

On 06/19/2013 05:34 AM, monarch_dodra wrote:

> I know a "binary" char can hold the values 0 to 0xFF. However, I'm
> wondering about the cases where a codepoint can fit inside a char. For
> example, 'ç' is represented by 0xe7, which technically fits inside a char.


'ç' is represented by 0xe7 in an encoding that is not UTF-8. :)

That would be a special agreement between the producer and the consumer of that string. Otherwise, 0xe7 is not 'ç'. I recommend ubyte[] for those cases.
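
As a minimal sketch of the ubyte[] suggestion, assume the "special agreement" is ISO-8859-1 (Latin-1), where every byte value equals its Unicode code point. That encoding is only an assumption for illustration, and latin1ToUtf8 is a hypothetical helper name, not something from Phobos:

```d
import std.stdio;

// Hypothetical helper: interprets each byte as an ISO-8859-1 (Latin-1)
// value, which is an assumed encoding for this example, and builds a
// UTF-8 string from the resulting code points.
string latin1ToUtf8(const(ubyte)[] bytes)
{
    string result;
    foreach (b; bytes)
        result ~= cast(dchar) b; // appending a dchar encodes it as UTF-8
    return result;
}

void main()
{
    // Raw bytes kept in ubyte[], so nothing claims they are UTF-8.
    ubyte[] raw = [ 0x61, 0x62, 0x63, 0xe7 ];

    string s = latin1ToUtf8(raw);
    writeln(s);        // abcç
    writeln(s.length); // 5, because 'ç' takes two bytes in UTF-8
}
```

Keeping the raw data as ubyte[] makes the conversion an explicit step, so the type system never treats the byte 0xe7 as a UTF-8 code unit by accident.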


In UTF-8, 0xe7 is the first byte of a 3-byte sequence that encodes a single code point:

import std.stdio;

void main()
{
    // 0xe7 0x80 0x80 is the UTF-8 encoding of the code point U+7000
    char[] a = [ 'a', 'b', 'c', 0xe7, 0x80, 0x80 ];
    writeln(a);
}

Prints a Chinese character:

abc瀀
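
To check the same point programmatically, here is a small sketch using std.utf.validate and std.utf.decode. isValidUtf8 is a hypothetical wrapper written for this example only:

```d
import std.stdio;
import std.utf : decode, validate, UTFException;

// Hypothetical wrapper: returns true if s is well-formed UTF-8.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s); // throws UTFException on malformed input
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    char[] lone = [ 'a', 0xe7 ];       // lead byte with no continuation bytes
    char[] full = [ 0xe7, 0x80, 0x80 ]; // complete 3-byte sequence

    writeln(isValidUtf8(lone)); // false
    writeln(isValidUtf8(full)); // true

    size_t i = 0;
    dchar c = decode(full, i);  // decodes one code point and advances i
    writefln("U+%04X, %s bytes", cast(uint) c, i); // U+7000, 3 bytes
}
```

So a char holding 0xe7 on its own is not 'ç'. It is an incomplete sequence, and it only becomes a character together with the continuation bytes that follow it.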

Ali
