Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

Victor Stinner Mon, 09 May 2011 17:59:07 -0700

On Tuesday 10 May 2011 at 09:52 +1000, Neil Hodgson wrote:
>    Some C and C++ implementations currently allow non-ASCII
> identifiers and the forthcoming C1X and C++0x language standards
> include non-ASCII identifiers. The allowed characters are specified in
> Annexes of the respective standards.
> 
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf - Annex D
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf - Annex E


I read these documents, but they don't explain which encoding is used for
identifiers in libraries and programs. Does it mean that Windows and Linux
may use different encodings? At least the surrogate range (U+D800-U+DFFF)
is excluded, which is good news (the UTF-8 decoder of Python 3 rejects
surrogate characters).

I discovered the -fextended-identifiers option of gcc: with this option,
you can use \uHHHH and \UHHHHHHHH in identifiers, but not \xHH. On
Linux, such identifiers are encoded to UTF-8.

Example:
--------------
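#define _ISOC99_SOURCE
#include <locale.h>
#include <stdio.h>
#include <wchar.h>   /* wprintf() */

/* Compile with: gcc -std=c99 -fextended-identifiers */
void f\u00E9(void) { wprintf(L"U+00E9 = \xE9\n"); }

void g\U000000E8(void) { wprintf(L"U+00E8 = \xE8\n"); }

int main(void)
{
    /* Use the user's locale so wprintf() can convert non-ASCII characters */
    setlocale(LC_ALL, "");
    f\u00E9(); g\U000000E8();
    return 0;
}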
--------------

It's not very practical: I would prefer to write Unicode characters
directly (as I can do in Python 3!). I'm not sure that Chinese developers
will prefer to call \u4f60\u597d() instead of hello().

OK, I now agree: it is possible to use non-ASCII characters in C. But
what about the encoding of symbols in a dynamic library: is it always
UTF-8?
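
To make "encoded to UTF-8" concrete, here is a minimal sketch of what the
bytes of such a symbol name would look like, assuming the toolchain really
emits UTF-8 as I observed with gcc on Linux. The names utf8_encode() and
utf8_encode_selfcheck() are hypothetical helpers written for this
illustration; they are not part of gcc, the C library or Python. The sketch
also rejects the surrogate range, as the standards' annexes do for
identifiers.

```c
#include <assert.h>

/* Hypothetical helper: encode one code point to UTF-8.
 * Returns the number of bytes written to out (1 to 4), or 0 if cp is a
 * surrogate (U+D800-U+DFFF) or is greater than U+10FFFF. */
int utf8_encode(unsigned long cp, unsigned char *out)
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* surrogates are not encodable */
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                           /* outside the Unicode range */
}

/* Hypothetical self-check: the non-ASCII part of f\u00E9 is U+00E9. */
void utf8_encode_selfcheck(void)
{
    unsigned char buf[4];

    assert(utf8_encode(0xE9, buf) == 2);
    assert(buf[0] == 0xC3 && buf[1] == 0xA9);
    assert(utf8_encode(0xDC00, buf) == 0);   /* surrogate: rejected */
}
```

Under that assumption, the symbol for f\u00E9() would be the three bytes
0x66 0xC3 0xA9 ('f' followed by the UTF-8 form of U+00E9). Whether other
compilers and platforms produce the same bytes is exactly the open question.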

Victor

