https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224
--- Comment #13 from Manuel López-Ibáñez <manu at gcc dot gnu.org> ---
(In reply to Eric from comment #12)
> I'm glad to know people like Joseph are working on UTF-8 in gcc.

I think at the moment, neither Joseph nor anyone else is planning to work on this. There does not seem to be sufficient demand for this feature for companies to fund it or for volunteers to step up and implement it (you are the first person I am aware of to attempt it).

> I spent a week adding UTF-8 input support to pcc. At that time Microsoft
> Studio and clang already supported UTF-8 input files and I expected that gcc
> would do so in the next release.

Unfortunately, GCC has very few developers compared to Microsoft or Clang. Many things in GCC will never get done unless new people contribute to its development. That is why, if you want to see this feature, you are the best and perhaps the only person to make it happen. The problem is that this cannot be fixed by a one-line patch, otherwise it would have been fixed a long time ago.

* GCC cannot rely on libiconv always being present. It has to work with glibc's iconv, which is what is used on GNU/Linux.

* Even if glibc's iconv supported the C99 conversions, this would break other things.

* You need to add tests explicitly for the various cases (see Joseph's comments). The tests will be added to the GCC testsuite to prove that your patch works as it should and to make sure future changes do not break it.

* At a minimum, look at all the gcc.dg/cpp/ucnid-*.c and g++.dg/cpp/ucnid-*.c tests and see what happens if you replace the \uNNNN escapes with the actual extended characters.

* Joseph thinks that the best approach is to do the conversion from UTF-8 to UCNs "manually" within cpplib, so that you can handle all the corner cases of C/C++ (quoted strings, \µ, macro names, ...).