On Fri, Mar 24, 2023 at 11:01:04AM -0700, Adam Wozniak wrote:
> Using "indent" on a C file with structure members with UTF-8 names (as
> allowed under C99 and later).
>
> indent completely mangles these member names, inserting spaces between UTF8
> bytes.
>
> -double ə14(double GST, struct φλ φλ) {

C99 leaves Unicode characters in identifiers as an implementation-defined
option:

    An implementation may allow multibyte characters that are not part of
    the basic source character set to appear in identifiers; which
    characters and their correspondence to universal character names is
    implementation-defined.

You have probably confused Unicode characters with universal character
names (sequences like \uNNNN and \UNNNNNNNN):

    Universal character names may be used in identifiers, character
    constants, and string literals to designate characters that are not in
    the basic character set.

Hence a C99-conforming compiler must support:

    double \u025914(double GST, struct \u03c6\u03bb \u03c6\u03bb);

but may support:

    double ə14(double GST, struct φλ φλ);

while the interoperability of the latter (e.g. linking two compilation
units together) is completely unspecified.

I am not saying that indent could not support Unicode characters (probably
depending on the locale, because indent needs to understand them to align
columns properly). Only that your claim about UTF-8 support in C99 is
misleading.

-- Petr
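
To illustrate the column-alignment point above: a tool that treats every
byte as one character will miscount the width of an identifier such as φλ,
which is four bytes in UTF-8 but only two characters. The sketch below is
only an illustration and is not taken from indent's source; the function
name utf8_code_points is hypothetical. It assumes valid UTF-8 input and
counts code points, which is still only an approximation of display
columns (combining and double-width characters are ignored).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GNU indent: counts the code points in a
   NUL-terminated string, assuming the input is valid UTF-8. */
size_t utf8_code_points(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* Continuation bytes have the form 10xxxxxx; any other byte
           starts a new code point. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

For example, utf8_code_points("\xcf\x86\xce\xbb") (the UTF-8 bytes of φλ)
yields 2, whereas strlen() on the same string yields 4.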