> _Py_IDENTIFIER(xxx) defines a variable called PyId_xxx, so xxx can
> only be ASCII: the C language doesn't accept non-ASCII identifiers.

That's not exactly true. In C89, source code is in the "source character
set", which is implementation-defined, except that it must contain the
"basic character set". I believe that it allows for implementation-defined
characters in identifiers.

In C99, this is extended to include "universal character names" (\u
escapes). They may appear in identifiers as long as the characters named
are listed in annex D.59 (which I cannot locate).

In C 2011, annexes D.1 and D.2 specify the characters that you can use
in an identifier:

D.1 Ranges of characters allowed
1. 00A8, 00AA, 00AD, 00AF, 00B2−00B5, 00B7−00BA, 00BC−00BE, 00C0−00D6,
   00D8−00F6, 00F8−00FF
2. 0100−167F, 1681−180D, 180F−1FFF
3. 200B−200D, 202A−202E, 203F−2040, 2054, 2060−206F
4. 2070−218F, 2460−24FF, 2776−2793, 2C00−2DFF, 2E80−2FFF
5. 3004−3007, 3021−302F, 3031−303F
6. 3040−D7FF
7. F900−FD3D, FD40−FDCF, FDF0−FE44, FE47−FFFD
8. 10000−1FFFD, 20000−2FFFD, 30000−3FFFD, 40000−4FFFD, 50000−5FFFD,
   60000−6FFFD, 70000−7FFFD, 80000−8FFFD, 90000−9FFFD, A0000−AFFFD,
   B0000−BFFFD, C0000−CFFFD, D0000−DFFFD, E0000−EFFFD

D.2 Ranges of characters disallowed initially
1. 0300−036F, 1DC0−1DFF, 20D0−20FF, FE20−FE2F

Regards,
Martin
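
To make the two tables concrete, here is a minimal C sketch that checks a code point against the D.1 and D.2 ranges quoted above. The names (cp_range, ucn_allowed_in_identifier, ucn_allowed_initially) are made up for illustration and are not part of any standard or of CPython. The check covers only the annex D ranges, not the basic-character-set letters, digits, and underscore that the ordinary identifier grammar already permits.

```c
#include <stddef.h>

/* Inclusive range of code points (hypothetical helper type). */
struct cp_range { unsigned long lo, hi; };

/* D.1, items 1-7: ranges of characters allowed in identifiers. */
static const struct cp_range d1_allowed[] = {
    {0x00A8, 0x00A8}, {0x00AA, 0x00AA}, {0x00AD, 0x00AD}, {0x00AF, 0x00AF},
    {0x00B2, 0x00B5}, {0x00B7, 0x00BA}, {0x00BC, 0x00BE}, {0x00C0, 0x00D6},
    {0x00D8, 0x00F6}, {0x00F8, 0x00FF},
    {0x0100, 0x167F}, {0x1681, 0x180D}, {0x180F, 0x1FFF},
    {0x200B, 0x200D}, {0x202A, 0x202E}, {0x203F, 0x2040}, {0x2054, 0x2054},
    {0x2060, 0x206F},
    {0x2070, 0x218F}, {0x2460, 0x24FF}, {0x2776, 0x2793}, {0x2C00, 0x2DFF},
    {0x2E80, 0x2FFF},
    {0x3004, 0x3007}, {0x3021, 0x302F}, {0x3031, 0x303F},
    {0x3040, 0xD7FF},
    {0xF900, 0xFD3D}, {0xFD40, 0xFDCF}, {0xFDF0, 0xFE44}, {0xFE47, 0xFFFD}
};

/* D.2: ranges of characters disallowed as the first character. */
static const struct cp_range d2_not_initial[] = {
    {0x0300, 0x036F}, {0x1DC0, 0x1DFF}, {0x20D0, 0x20FF}, {0xFE20, 0xFE2F}
};

/* Return 1 if cp falls inside any of the n ranges in r, else 0. */
static int in_ranges(unsigned long cp, const struct cp_range *r, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (cp >= r[i].lo && cp <= r[i].hi)
            return 1;
    }
    return 0;
}

/* Return 1 if cp is listed in D.1, else 0. */
int ucn_allowed_in_identifier(unsigned long cp)
{
    /* D.1 item 8: planes 1 through 14, excluding the last two code
       points (xFFFE and xFFFF) of each plane. */
    if (cp >= 0x10000UL && cp <= 0xEFFFDUL && (cp & 0xFFFFUL) <= 0xFFFDUL)
        return 1;
    return in_ranges(cp, d1_allowed,
                     sizeof d1_allowed / sizeof d1_allowed[0]);
}

/* Return 1 if cp is listed in D.1 and not excluded by D.2, else 0. */
int ucn_allowed_initially(unsigned long cp)
{
    return ucn_allowed_in_identifier(cp)
        && !in_ranges(cp, d2_not_initial,
                      sizeof d2_not_initial / sizeof d2_not_initial[0]);
}
```

For example, U+00E9 passes both checks, whereas U+0301 (a combining accent) is accepted inside an identifier but rejected as its first character, and U+00D7 (the multiplication sign) is rejected outright.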