Brett Cannon <br...@python.org> added the comment:

Here is what I have found out so far. Python/bltinmodule.c:builtin_compile takes in a PyObject, gets the char * representation of that object, and passes it to Python/pythonrun.c:Py_CompileStringFlags. No other information is passed along in the call, including what the encoding happens to be. That is unfortunate, because builtin_compile makes sure that the char * data is encoded using the default encoding before calling Py_CompileStringFlags, so the encoding is known at that point and then dropped.
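A rough Python-level illustration of the problem described above, not the C code path itself: once text has been turned into a bare byte buffer, the receiver has to guess the encoding. The helper names `to_char_buffer` and `consume_buffer` are hypothetical stand-ins, and UTF-8 is used here only as an assumed default encoding.

```python
# Hypothetical illustration only; these helpers are not CPython functions.
source = 'x = "\u00e9"'  # source text containing a non-ASCII character


def to_char_buffer(text, encoding):
    """Stand-in for obtaining the char * data: only raw bytes come out."""
    return text.encode(encoding)


def consume_buffer(raw, assumed_encoding):
    """Stand-in for a callee that receives bytes with no encoding attached."""
    return raw.decode(assumed_encoding)


raw = to_char_buffer(source, "utf-8")   # assumed default encoding
wrong = consume_buffer(raw, "latin-1")  # callee guesses incorrectly
right = consume_buffer(raw, "utf-8")    # callee happens to guess correctly

# compile() treats bytes input as UTF-8 unless a coding cookie says otherwise,
# so the UTF-8 buffer round-trips cleanly here.
namespace = {}
exec(compile(raw, "<example>", "exec"), namespace)
```

The bytes in `raw` are identical in both `consume_buffer` calls; only the guess differs, which is why passing the encoding along (or fixing it to one known value) matters.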
I just tried setting a PyCF flag to denote that the char* data is encoded using the default encoding, but Parser/tokenizer.c is not compiled against unicodeobject.c and thus one cannot use PyUnicode_GetDefaultEncoding() to know what the data is stored as. I'm going to try to explicitly convert to UTF-8 and see if that works.

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue4626>
_______________________________________