[issue6543] traceback presented in wrong encoding
STINNER Victor victor.stin...@haypocalc.com added the comment:

Update and improve the patch:

- Update the patch to py3k (replace tabs with spaces)
- Check whether the _PyUnicode_AsString() result is NULL
- _Py_FindSourceFile() returns the file instead of NULL on success!
- Use utf-8 directly as the default source code encoding (which is constant) instead of calling PyUnicode_GetDefaultEncoding()
- Use PyUnicode_FromFormat() instead of PyOS_snprintf() in tb_displayline() to avoid converting from unicode to utf-8 and then converting utf-8 back to unicode (in PyFile_WriteString). The type of name is now PyObject*
- Also reindent the PyTracebackObject structure in traceback.h, just because I hate tabs :-)

--
Added file: http://bugs.python.org/file17702/traceback-encoding-2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6543
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue6543] traceback presented in wrong encoding
STINNER Victor victor.stin...@haypocalc.com added the comment:

I tested the last patch on Windows: it does fix the bug, and the traceback is displayed correctly in my terminal charset (cp850).

I committed the fix to Python 3.1 (r82063) and 3.2 (r82059+r82061).

--
resolution: -> fixed
status: open -> closed
[issue6543] traceback presented in wrong encoding
Amaury Forgeot d'Arc amaur...@gmail.com added the comment:

In issue3343, we chose to mark this function as private.
[issue6543] traceback presented in wrong encoding
STINNER Victor victor.stin...@haypocalc.com added the comment:

The patch changes the prototype of the _Py_DisplaySourceLine() function. Is it possible that a third-party module uses this function? Should we keep backward compatibility with third-party modules using the private C API?
[issue6543] traceback presented in wrong encoding
STINNER Victor victor.stin...@haypocalc.com added the comment:

> surrogateescape characters are not printable

stderr uses the backslashreplace error handler, and so non-decodable characters will be displayed as \xHH.

... see also #8092 :-)
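The interaction between the two error handlers can be tried directly in Python. This is a minimal sketch, assuming a Latin-1 encoded filename (the name `café.py` is borrowed from elsewhere in this thread) that is decoded as UTF-8 with surrogateescape; the variable names are illustrative only. Note that in this sketch the escaped byte is shown in the `\udcHH` form.

```python
# Illustrative only: an undecodable byte kept via surrogateescape,
# then rendered with the backslashreplace error handler.
raw_name = b"caf\xe9.py"  # Latin-1 bytes, not valid UTF-8
name = raw_name.decode("utf-8", "surrogateescape")

try:
    name.encode("utf-8")  # strict encoding rejects the lone surrogate
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# backslashreplace never fails: unencodable characters become escapes.
shown = name.encode("ascii", "backslashreplace").decode("ascii")
print(strict_ok)
print(shown)
```

The strict encode fails, which is the "not printable" concern, while the backslashreplace path always yields something that can be written to the terminal.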
[issue6543] traceback presented in wrong encoding
Amaury Forgeot d'Arc amaur...@gmail.com added the comment:

Storing unicode in c_filename would not solve the problem: surrogateescape characters are not printable. There is no need to support non-decodable filenames in the import mechanism.
[issue6543] traceback presented in wrong encoding
STINNER Victor victor.stin...@haypocalc.com added the comment:

> in compile.c, the c_filename member has utf8 encoding

The problem is maybe that c_filename should be a unicode object created using the file system default encoding and the surrogateescape error handler, to be able to store undecodable filenames (useful on POSIX OSes using a byte string API, e.g. Linux).

--
nosy: +haypo
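What "store undecodable filenames" means in practice can be sketched in Python. The example below assumes UTF-8 as the file system encoding and uses a made-up path; it only illustrates the lossless round trip that surrogateescape provides, not the C code in compile.c.

```python
# Illustrative only: surrogateescape makes bytes -> str -> bytes lossless,
# even when the bytes are not valid in the chosen encoding.
raw_path = b"/tmp/caf\xe9.py"  # hypothetical path with a Latin-1 byte

as_text = raw_path.decode("utf-8", "surrogateescape")
back_to_bytes = as_text.encode("utf-8", "surrogateescape")

print(ascii(as_text))
print(back_to_bytes == raw_path)
```

On POSIX systems, os.fsdecode() and os.fsencode() apply the same error handler with the actual file system encoding.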
[issue6543] traceback presented in wrong encoding
Changes by Florent Xicluna florent.xicl...@gmail.com:

--
nosy: +flox
stage: -> patch review
[issue6543] traceback presented in wrong encoding
Sean Reifschneider j...@tummy.com added the comment:

From a cursory glance, I don't see any problems with this patch, though I admit that I don't know the traceback code nearly as well as you, Amaury.

The tests pass on py3k trunk on my Linux box.

If you want another review, perhaps ask on python-dev?

--
nosy: +jafo
[issue6543] traceback presented in wrong encoding
Changes by Sean Reifschneider j...@tummy.com:

--
priority: -> normal
[issue6543] traceback presented in wrong encoding
New submission from Fan Decheng fandech...@gmail.com:

Traceback information is wrongly encoded.

Steps to reproduce:

1. Use a version of Windows that supports CP936 (Simplified Chinese) as the default encoding.
2. Create a directory containing Chinese characters, such as C:\测试
3. In the directory, create a Python file such as C:\测试\test.py
4. In the Python file, enter the following lines:

    import traceback
    try:
        aaa # create a non-existent name
    except Exception as ex:
        traceback.print_exc()

5. Run the program with this command line (remember to use the full path to the test.py file):

    C:\Python31\python.exe C:\测试\test.py

6. See the output.

Expected result: the output is correct, without encoding problems, such as:

    Traceback (most recent call last):
      File "C:\测试\test.py", line 3, in <module>
    NameError: name 'aaa' is not defined

Actual result: the UTF-8 encoded string is decoded using CP936, so the output is incorrect.

    Traceback (most recent call last):
      File "C:\娴嬭瘯\test.py", line 3, in <module>
    NameError: name 'aaa' is not defined

Additional information:

In Python 3.0, such a test would generate:

    File decoding error, line 221, in main

In Python 3.1, the test generates the output mentioned in the repro steps. When I tried traceback.format_exc(), it seems the original characters 测试 had become three Unicode characters when returned by format_exc().

--
components: Interpreter Core
messages: 90803
nosy: r_mosaic
severity: normal
status: open
title: traceback presented in wrong encoding
type: behavior
versions: Python 3.1
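The garbling described in this report can be reproduced on any platform, without a CP936 Windows installation. A minimal sketch, assuming only what the report states (the name is encoded as UTF-8 and then decoded as CP936); the variable names are illustrative.

```python
# Illustrative only: mimic UTF-8 bytes being decoded with cp936.
original = "测试"
utf8_bytes = original.encode("utf-8")   # 6 bytes, 3 per character
mojibake = utf8_bytes.decode("cp936")   # cp936 pairs the bytes up instead

# ascii() keeps the printout independent of the console encoding.
print(len(utf8_bytes), len(mojibake))
print(ascii(mojibake))

# Reversing the two steps recovers the original directory name.
recovered = mojibake.encode("cp936").decode("utf-8")
print(recovered == original)
```

The six UTF-8 bytes are read as three two-byte CP936 characters, which matches the observation that format_exc() returns three Unicode characters in place of the original two.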
[issue6543] traceback presented in wrong encoding
Amaury Forgeot d'Arc amaur...@gmail.com added the comment:

This also happens on a Western Windows (cp437, mbcs==cp1252) with a filename like café.py.

The attached patch corrects three problems:

- in compile.c, the c_filename member has utf8 encoding, and must not be decoded with PyUnicode_DecodeFSDefault. This is the reported issue.
- Same thing in pythonrun.c, if you want print(__file__) to work.
- in traceback.c, the content of the file is not shown.

Tested with this script:

    print("file name:", __file__)
    import traceback
    try:
        aaa
    except:
        traceback.print_exc()
        raise

The output should be:

    file name: c:\temp\café.py
    Traceback (most recent call last):
      File "c:\temp\café.py", line 4, in <module>
        aaa
    NameError: name 'aaa' is not defined
    Traceback (most recent call last):
      File "c:\temp\café.py", line 4, in <module>
        aaa
    NameError: name 'aaa' is not defined

--
keywords: +patch
nosy: +amaury.forgeotdarc
Added file: http://bugs.python.org/file14538/traceback-encoding.patch
[issue6543] traceback presented in wrong encoding
Changes by Amaury Forgeot d'Arc amaur...@gmail.com:

--
keywords: +needs review