On Tue, Dec 03, 2002 at 10:33:19PM -0800, H. Peter Anvin wrote:
> Followup to: <[EMAIL PROTECTED]>
> By author: Jungshik Shin <[EMAIL PROTECTED]>
> In newsgroup: linux.utf8
>
> > That's simple, but how would you deal with the fact that
> > Unicode has multiple representations of what people would usually
> > regard as equivalent? To enable UTF-8 identifiers, that has
> > to be taken care of by gcc and linker (if gcc doesn't do a compile-time
> > normalization).
> >
>
> I don't really think normalization is a major issue here. Maybe it
> should be, but I suspect it isn't a problem in practice. I suspect
> attempting normalization would cause more problems than it's worth.
>
> Maybe a --normalize-utf option to the linker might be a good idea, but
> it should be an option, IMO.

First of all, the C standard does not refer to Unicode, but to ISO/IEC 10646.
And the C standard does not use Unicode normalization. The ISO C standard
contains a list of the 10646 characters that are allowed in identifiers,
and these do not have alternate representations.

Kind regards
keld
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
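
To illustrate the "multiple representations" concern raised in the quoted text, here is a minimal Python sketch. It is not taken from gcc, the linker, or the C standard: the sample name and the helper same_identifier are made up for illustration, and NFC is only one possible choice of normalization form. It shows that a precomposed character and its base-letter-plus-combining-mark spelling differ as code point sequences and as UTF-8 bytes, but compare equal once both are normalized.

```python
import unicodedata

# Two spellings of what a reader would see as the same name.
composed = "caf\u00e9"     # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT


def same_identifier(a: str, b: str, form: str = "NFC") -> bool:
    """Hypothetical helper: compare two names after normalizing both."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# A byte-wise comparison, as a linker would do on symbol names, sees two names.
print(composed == decomposed)      # False
print(composed.encode("utf-8"))    # b'caf\xc3\xa9'
print(decomposed.encode("utf-8"))  # b'cafe\xcc\x81'

# After normalization the two spellings compare equal.
print(same_identifier(composed, decomposed))  # True
```

If combining marks such as U+0301 are not in the set of characters permitted in identifiers, the second spelling cannot occur in a valid identifier at all, which is the situation the reply above describes.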
