[PATCH] D106577: [clang] Define __STDC_ISO_10646__

Joerg Sonnenberger via Phabricator via cfe-commits Fri, 23 Jul 2021 12:37:32 -0700

joerg added a comment.

In D106577#2899715 <https://reviews.llvm.org/D106577#2899715>, @aaron.ballman 
wrote:

> In D106577#2899711 <https://reviews.llvm.org/D106577#2899711>, @joerg wrote:
>
>> This patch is certainly wrong for NetBSD as the wchar_t encoding is up to 
>> the specific locale charset and *not* UCS-2 or UCS-4 for certain legacy 
>> encodings like the various shift encodings in East Asia.
>
> How does the value of a macro get impacted by a runtime locale?

NetBSD doesn't set the macro. And yes, this is one of the fundamental design 
issues of wide character literals. Section 2 of the following, now 20-year-old, 
Itojun paper goes into some of the problems with assuming a single universal 
character set:
https://www.usenix.org/legacy/publications/library/proceedings/usenix01/freenix01/full_papers/hagino/hagino.pdf
Even an encoding that embeds ISO 10646 fully and uses a flag bit to denote 
such values (entirely valid, as Unicode is restricted to 21 bits) should not 
get this macro set.
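
To make the flag-bit case concrete, here is a minimal sketch in C. The 
encoding is hypothetical (it is not NetBSD's actual wchar_t scheme), and the 
names HYP_UNICODE_FLAG, HYP_UNICODE_MASK, hyp_from_unicode, hyp_to_unicode and 
wchar_is_iso10646 are made up for illustration. The C standard says that when 
__STDC_ISO_10646__ is defined, a character stored in a wchar_t has the same 
value as its ISO 10646 short identifier. In the sketch, the wide value always 
differs from the code point by the flag bit, so the macro would be wrong even 
though every code point is representable.

```c
#include <stdint.h>

/* Hypothetical encoding, for illustration only: bit 31 flags a value that
 * carries a Unicode code point in its low 21 bits. Values without the flag
 * are assumed to belong to some locale-specific legacy charset. */
#define HYP_UNICODE_FLAG UINT32_C(0x80000000)
#define HYP_UNICODE_MASK UINT32_C(0x001FFFFF)

/* Encode a Unicode code point into the hypothetical wide representation. */
static uint32_t hyp_from_unicode(uint32_t cp) {
    return HYP_UNICODE_FLAG | (cp & HYP_UNICODE_MASK);
}

/* Returns 1 and stores the code point if w carries one, otherwise 0. */
static int hyp_to_unicode(uint32_t w, uint32_t *cp) {
    if (!(w & HYP_UNICODE_FLAG))
        return 0;
    *cp = w & HYP_UNICODE_MASK;
    return 1;
}

/* Reports whether this implementation promises ISO 10646 wchar_t values. */
static int wchar_is_iso10646(void) {
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}
```

Code that wants to treat wchar_t values as code points would have to check 
something like wchar_is_iso10646() first and otherwise convert through the 
locale, rather than assume the mapping is the identity.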

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106577/new/

https://reviews.llvm.org/D106577

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
