------- Additional Comments From joseph at codesourcery dot com 2005-02-21 19:47 -------
Subject: Re: UCNs not recognized in identifiers (c++/c99)

On Mon, 21 Feb 2005, geoffk at geoffk dot org wrote:

> > * These rules apply to identifiers as preprocessing tokens at any
> >   time, including before concatenation.  So it is not the case in C99
> >   that splitting an identifier anywhere yields two valid preprocessing
> >   tokens: the second half could begin with a UCN for a digit and not be
> >   a valid identifier.  (Invalid identifiers in C99 don't require
> >   diagnostics, but I don't think we want to use this laxity.)
>
> The second half would be a pp-number, instead.  It is always true that
> splitting an identifier between characters yields two valid
> preprocessing tokens.

It would not be a pp-number: in terms of the syntax, a UCN for a digit is
still an identifier-nondigit rather than a digit, and a pp-number can't
start with an identifier-nondigit.

> > * All uses of identifiers and DECL_ASSEMBLER_NAME in the compiler
> >   should be audited to determine what sort of identifier is appropriate
> >   in each case.
>
> I don't understand this sentence.  What different sorts of identifiers
> are there, and how could they be appropriate or not appropriate?

* Identifiers found in input, with input spelling.  (Input includes -D
  and -U options on the command line; in principle the command line
  should be interpreted in the user's locale by default, just like
  source files.)

* UTF-8 (or, I suppose, UTF-EBCDIC) internally encoded identifiers.

* Identifiers in mangled form, in any case where they are mangled for
  output.

* Identifiers in diagnostics (possibly including cases where bits of a
  diagnostic get built up with sprintf), which need converting to the
  user's locale for display, or need to be displayed using UCNs.

I don't know if collect2 might also need to know something about extended
identifiers.

The aim is that every data structure holding an identifier should have a
well-defined encoding (input, internal, output, diagnostic), and that
conversions between these encodings should be handled properly.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=9449
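
To make the pp-number point above concrete, here is a minimal sketch in C of the two start-of-token checks involved. It is not cpplib code: the function names (parse_ucn, ucn_is_digit, starts_pp_number, can_start_identifier) are hypothetical. The digit table holds only one range (0660-0669) as an illustrative subset of the C99 Annex D "Digits" list. The sketch also ignores the separate requirement that a UCN in an identifier fall within the Annex D ranges at all.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* If s begins with a UCN spelling (\uXXXX or \UXXXXXXXX), store its value
   in *cp and return the spelling's length; otherwise return 0. */
static size_t parse_ucn(const char *s, unsigned long *cp)
{
    assert(s != NULL && cp != NULL);
    if (s[0] != '\\' || (s[1] != 'u' && s[1] != 'U'))
        return 0;
    size_t ndigits = (s[1] == 'u') ? 4 : 8;
    unsigned long value = 0;
    for (size_t i = 0; i < ndigits; i++) {
        unsigned char c = (unsigned char)s[2 + i];
        if (!isxdigit(c))          /* also stops at the terminating NUL */
            return 0;
        value = value * 16
                + (unsigned long)(isdigit(c) ? c - '0' : tolower(c) - 'a' + 10);
    }
    *cp = value;
    return 2 + ndigits;
}

/* ASSUMPTION: illustrative subset of the C99 Annex D "Digits" ranges. */
static bool ucn_is_digit(unsigned long cp)
{
    return cp >= 0x0660 && cp <= 0x0669;   /* Arabic-Indic digits */
}

/* A pp-number starts with a digit, or with '.' followed by a digit.
   A UCN is an identifier-nondigit, so it never satisfies this test. */
bool starts_pp_number(const char *s)
{
    if (isdigit((unsigned char)s[0]))
        return true;
    return s[0] == '.' && isdigit((unsigned char)s[1]);
}

/* An identifier may not start with a basic digit, nor with a UCN that
   designates a digit. */
bool can_start_identifier(const char *s)
{
    unsigned long cp;
    if (parse_ucn(s, &cp) > 0)
        return !ucn_is_digit(cp);
    return s[0] == '_' || isalpha((unsigned char)s[0]);
}
```

With these checks, the spelling `\u0660abc` (passed as the C string "\\u0660abc") is rejected by both starts_pp_number and can_start_identifier. That is the case described above: the second half of a split identifier is neither a pp-number nor a valid identifier.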