On Tuesday 10 May 2011 at 09:52 +1000, Neil Hodgson wrote:
> Some C and C++ implementations currently allow non-ASCII
> identifiers and the forthcoming C1X and C++0x language standards
> include non-ASCII identifiers. The allowed characters are specified in
> Annexes of the respective standards.
>
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf - Annex D
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf - Annex E

I read these documents, but they don't explain which encoding is used in libraries and programs. Does that mean that Windows and Linux may use different encodings? At least the surrogate range (U+D800-U+DFFF) is excluded, which is good news (the UTF-8 decoder of Python 3 rejects surrogate characters).

I discovered the -fextended-identifiers option of gcc: with this option, you can use \uHHHH and \UHHHHHHHH in identifiers, but not \xHH. On Linux, identifiers are encoded to UTF-8. Example:

--------------
#define _ISOC99_SOURCE
#include <stdio.h>
#include <wchar.h>   /* declares wprintf() */

void f\u00E9(void) { wprintf(L"U+00E9 = \xE9\n"); }
void g\U000000E8(void) { wprintf(L"U+00E8 = \xE8\n"); }

int main(void)
{
    f\u00E9();
    g\U000000E8();
    return 0;
}
--------------

It's not very practical; I would prefer to write Unicode characters directly (as I can in Python 3!). I'm not sure that Chinese developers will prefer to call \u4f60\u597d() instead of hello().

OK, I now agree that it is possible to use non-ASCII characters in C. But what about the encoding of symbols in a dynamic library: is it always UTF-8?

Victor