> _Py_IDENTIFIER(xxx) defines a variable called PyId_xxx, so xxx can
> only be ASCII: the C language doesn't accept non-ASCII identifiers.

That's not exactly true. In C89, source code is in the "source character
set", which is implementation-defined, except that it must contain
the "basic character set". I believe that it allows for
implementation-defined characters in identifiers. In C99, this is
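extended to include "universal character names" (\u and \U escapes). They
may appear in identifiers as long as the characters named fall into the
ranges listed in annex D (the "D.59" in the text of 6.4.2.1 is most likely
"annex D" followed by footnote marker 59, not a subsection number).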

In C11, annex D (sections D.1 and D.2) specifies the characters that you
can use in an identifier:

D.1 Ranges of characters allowed
1. 00A8, 00AA, 00AD, 00AF, 00B2−00B5, 00B7−00BA, 00BC−00BE, 00C0−00D6,
00D8−00F6, 00F8−00FF
2. 0100−167F, 1681−180D, 180F−1FFF
3. 200B−200D, 202A−202E, 203F−2040, 2054, 2060−206F
4. 2070−218F, 2460−24FF, 2776−2793, 2C00−2DFF, 2E80−2FFF
5. 3004−3007, 3021−302F, 3031−303F
6. 3040−D7FF
7. F900−FD3D, FD40−FDCF, FDF0−FE44, FE47−FFFD
8. 10000−1FFFD, 20000−2FFFD, 30000−3FFFD, 40000−4FFFD, 50000−5FFFD,
60000−6FFFD, 70000−7FFFD, 80000−8FFFD, 90000−9FFFD, A0000−AFFFD,
B0000−BFFFD, C0000−CFFFD, D0000−DFFFD, E0000−EFFFD

D.2 Ranges of characters disallowed initially
1. 0300−036F, 1DC0−1DFF, 20D0−20FF, FE20−FE2F
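
To make the two lists concrete, here is a minimal sketch in C of a check
against them. The names cp_range, d1_allowed, d2_not_initial and
ucn_allowed_in_identifier are made up for this illustration and come from no
compiler or library. The function covers only the annex D characters; the
ordinary ASCII letters, digits and underscore are handled by the basic
identifier grammar and are deliberately not treated here.

```c
#include <stddef.h>
#include <stdint.h>

/* Inclusive range of code points (hypothetical helper type). */
struct cp_range { uint32_t lo, hi; };

/* D.1, items 1-7: ranges allowed in identifiers, transcribed from the list. */
static const struct cp_range d1_allowed[] = {
    {0x00A8, 0x00A8}, {0x00AA, 0x00AA}, {0x00AD, 0x00AD}, {0x00AF, 0x00AF},
    {0x00B2, 0x00B5}, {0x00B7, 0x00BA}, {0x00BC, 0x00BE}, {0x00C0, 0x00D6},
    {0x00D8, 0x00F6}, {0x00F8, 0x00FF},
    {0x0100, 0x167F}, {0x1681, 0x180D}, {0x180F, 0x1FFF},
    {0x200B, 0x200D}, {0x202A, 0x202E}, {0x203F, 0x2040}, {0x2054, 0x2054},
    {0x2060, 0x206F},
    {0x2070, 0x218F}, {0x2460, 0x24FF}, {0x2776, 0x2793}, {0x2C00, 0x2DFF},
    {0x2E80, 0x2FFF},
    {0x3004, 0x3007}, {0x3021, 0x302F}, {0x3031, 0x303F},
    {0x3040, 0xD7FF},
    {0xF900, 0xFD3D}, {0xFD40, 0xFDCF}, {0xFDF0, 0xFE44}, {0xFE47, 0xFFFD},
};

/* D.2: ranges that may not appear as the first character. */
static const struct cp_range d2_not_initial[] = {
    {0x0300, 0x036F}, {0x1DC0, 0x1DFF}, {0x20D0, 0x20FF}, {0xFE20, 0xFE2F},
};

/* Return 1 if cp lies in any of the n ranges, else 0. */
static int in_ranges(uint32_t cp, const struct cp_range *r, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (cp >= r[i].lo && cp <= r[i].hi)
            return 1;
    }
    return 0;
}

/*
 * Return 1 if the code point cp is permitted by D.1 (and, when is_first is
 * nonzero, not excluded by D.2), else 0. ASCII characters are out of scope
 * and therefore yield 0.
 */
int ucn_allowed_in_identifier(uint32_t cp, int is_first)
{
    int allowed = in_ranges(cp, d1_allowed,
                            sizeof d1_allowed / sizeof d1_allowed[0]);

    /* D.1, item 8: planes 1 through 14, each up to xFFFD. */
    if (!allowed && cp >= 0x10000 && cp <= 0xEFFFD && (cp & 0xFFFF) <= 0xFFFD)
        allowed = 1;

    if (allowed && is_first &&
        in_ranges(cp, d2_not_initial,
                  sizeof d2_not_initial / sizeof d2_not_initial[0]))
        allowed = 0;

    return allowed;
}
```

With this sketch, U+00E9 is accepted in any position, U+0301 (a combining
accent) is accepted everywhere except as the first character, and U+00D7
(the multiplication sign) is rejected.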

Regards,
Martin
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
