> > to Unicode/UTF-8 before (and after?) processing. They also have to
> > handle \uXXXX and \UXXXXXXXX as Unicode regardless of input format, so
> > I agree that this is more urgent than supporting UTF-8 chars. in
> > identifiers.

I'm not sure I agree yet that the escape sequences are more important. It
seems so simple to just "switch off" whatever error gcc generates when it
gets bytes above 7F, and allow them through. This would work fine with the
filesystem, assuming it is UTF-8 as well: #includes DO work fine with UTF-8
filenames (I just tried this with gcc under RH8). Text strings and comments
already work fine with UTF-8. Just identifiers don't.

I think even a "use at your own risk" command-line switch, such as
"--allow-high-ascii", would be a huge step forward.

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
