Re: DMD: invalid UTF character `\U0000d800`

Jacob Carlborg via Digitalmars-d-learn Sat, 07 Nov 2020 09:51:15 -0800

On Saturday, 7 November 2020 at 16:12:06 UTC, Per Nordlöw wrote:

CtoLexer_parser.d 665 57 error invalid UTF character \U0000d800
CtoLexer_parser.d 665 67 error invalid UTF character \U0000dbff
CtoLexer_parser.d 666 28 error invalid UTF character \U0000d800
CtoLexer_parser.d 666 38 error invalid UTF character \U0000dbff
CtoLexer_parser.d 666 53 error invalid UTF character \U0000dc00
CtoLexer_parser.d 666 63 error invalid UTF character \U0000dfff
Doesn't DMD support these Unicodes yet?


They're not valid:

"The Unicode standard permanently reserves these code point values for UTF-16 encoding of the high and low surrogates, and they will never be assigned a character, so there should be no reason to encode them. The official Unicode standard says that no UTF forms, including UTF-16, can encode these code points" [1].

"... the standard states that such arrangements should be treated as encoding errors" [1].

Perhaps they need to be combined with other code points to form a valid character.
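
As a rough illustration of the above, here is a minimal sketch in D. It assumes Phobos' std.utf.isValidDchar for the validity check; toSurrogatePair is a hypothetical helper written only for this example and is not part of DMD or Phobos. It shows that a lone surrogate is rejected as a character, while a code point above U+FFFF can be written with a single \U escape and is stored in a wstring as a high/low surrogate pair.

```d
import std.utf : isValidDchar;

// Hypothetical helper (not from Phobos): split a code point above U+FFFF
// into its UTF-16 high and low surrogate code units.
wchar[2] toSurrogatePair(dchar c)
{
    assert(c > 0xFFFF && c <= 0x10FFFF);
    immutable uint v = c - 0x10000;
    return [cast(wchar)(0xD800 + (v >> 10)),   // high surrogate: top 10 bits
            cast(wchar)(0xDC00 + (v & 0x3FF))]; // low surrogate: bottom 10 bits
}

void main()
{
    // Surrogate code points are not valid characters on their own.
    assert(!isValidDchar(cast(dchar) 0xD800));
    assert(!isValidDchar(cast(dchar) 0xDFFF));

    // U+1F600 is a valid code point, so this escape is accepted.
    // As UTF-16 it occupies two code units, both in the surrogate range.
    wstring w = "\U0001F600"w;
    assert(w.length == 2);
    assert(w[0] == 0xD83D && w[1] == 0xDE00);

    // The helper computes the same pair arithmetically.
    auto pair = toSurrogatePair('\U0001F600');
    assert(pair[0] == w[0] && pair[1] == w[1]);
}
```

So if the grammar really needs to describe the surrogate ranges, one option (an assumption on my part about what the generated lexer needs) is to work on UTF-16 code units as wchar values rather than spelling them as \U character escapes.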


[1] https://en.wikipedia.org/wiki/UTF-16#U+D800_to_U+DFFF

--
/Jacob Carlborg
