[Bug c/67224] UTF-8 support for identifier names in GCC

ejolson at unr dot edu Mon, 17 Aug 2015 17:12:13 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224


--- Comment #12 from Eric <ejolson at unr dot edu> ---
I'm glad to know people like Joseph are working on UTF-8 in gcc.  Last year I
spent a week adding UTF-8 input support to pcc.  At that time Microsoft Visual
Studio and clang already supported UTF-8 input files, and I expected that gcc
would do so in the next release.  As this didn't happen, a few months ago I
looked into it and developed a one-line patch to add this support to gcc.

It appears the C preprocessor falls back to libiconv when it encounters a
conversion not supported internally.  From what I can tell this is enabled by
default, though it is surely possible to disable it.

I'm aware that C strings are often used to store 8-bit data, for example, to
display various graphics characters from legacy code pages.  I will run the
regression tests as soon as possible to see what, if anything, has been broken
by my one-line patch.  UCN quoting of UTF-8 input should happen only if the
-finput-charset=UTF-8 flag is set, and this is worth checking.
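
To make "UCN quoting" concrete, here is a minimal sketch of the idea: every
non-ASCII UTF-8 sequence in the input is rewritten as a universal character
name (\uXXXX or \UXXXXXXXX), and plain ASCII is copied through.  The function
names utf8_decode and utf8_to_ucn are hypothetical and chosen only for this
illustration; this is not the code in gcc or libcpp, and it is not the
one-line patch.  It also does not check which code points C allows in
identifiers.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: decode one UTF-8 sequence from s (n bytes available).
   Stores the code point in *cp and returns the number of bytes consumed, or
   0 if the sequence is truncated, malformed, overlong, a surrogate, or above
   U+10FFFF. */
static size_t utf8_decode(const unsigned char *s, size_t n, unsigned long *cp)
{
    /* Smallest code point that legitimately needs a sequence of each length. */
    static const unsigned long min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t len, i;
    unsigned long c;

    if (n == 0) return 0;
    if (s[0] < 0x80)                { len = 1; c = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; }
    else return 0;                  /* stray continuation or invalid lead byte */
    if (len > n) return 0;

    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min_cp[len] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;
    *cp = c;
    return len;
}

/* Hypothetical helper: copy src to dst, replacing each non-ASCII UTF-8
   sequence with a UCN.  Returns 0 on success, -1 if the input is not valid
   UTF-8 or dst is too small. */
static int utf8_to_ucn(const char *src, char *dst, size_t dstsize)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t remaining = strlen(src);
    size_t out = 0;

    while (remaining > 0) {
        unsigned long cp;
        size_t len = utf8_decode(s, remaining, &cp);

        if (len == 0) return -1;
        if (cp < 0x80) {
            if (out + 1 >= dstsize) return -1;
            dst[out++] = (char)cp;
        } else {
            /* Longest UCN is 10 characters, plus the terminating NUL. */
            if (dstsize - out < 11) return -1;
            out += (size_t)sprintf(dst + out,
                                   cp <= 0xFFFF ? "\\u%04lX" : "\\U%08lX", cp);
        }
        s += len;
        remaining -= len;
    }
    if (out >= dstsize) return -1;
    dst[out] = '\0';
    return 0;
}
```

With this sketch, the UTF-8 bytes C3 A9 (U+00E9) come out as \u00E9, and a
lone byte such as C3 is rejected rather than passed through.  Applying such a
rewrite unconditionally would also alter 8-bit data inside string literals,
which is why it should depend on the input charset.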
