New submission from STINNER Victor:

Since changeset 45079ad1e260 (issue #4388), command line arguments on Mac OS X 
are decoded from UTF-8 instead of the locale encoding. The functions in 
Python/fileutils.c still use the locale encoding.

This mismatch breaks script loading: see issue #16218. On Mac OS X, in the 
command line "python script.py", the filename "script.py" is decoded from UTF-8 
(by _Py_DecodeUTF8_surrogateescape) but is then passed to _Py_fopen(), which 
encodes the filename to the locale encoding (e.g. ISO-8859-1 if the $LANG, 
$LC_CTYPE and $LC_ALL environment variables are not set). The result is 
mojibake, and Python fails to open the script.
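
The decode/encode mismatch described above can be reproduced in pure Python. 
This is only a sketch: the filename "écrit.py" is a made-up example, and 
str.encode()/bytes.decode() stand in for the C functions, which are not called 
here.

```python
# Bytes of a hypothetical non-ASCII script name, as passed on the command line.
argv_bytes = "écrit.py".encode("utf-8")

# Step 1: the command line argument is decoded from UTF-8.
decoded = argv_bytes.decode("utf-8", "surrogateescape")

# Step 2: the filename is encoded again, but with the locale encoding
# (ISO-8859-1 is assumed here, as in the example above).
wrong = decoded.encode("iso-8859-1", "surrogateescape")

# Encoding with UTF-8 instead gives back the original bytes.
right = decoded.encode("utf-8", "surrogateescape")

print(argv_bytes, wrong, right)
```

The bytes produced in step 2 differ from the bytes on the command line, so the 
name handed to the operating system no longer matches the file on disk.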

The attached patch modifies the functions of Python/fileutils.c to use UTF-8, 
instead of the locale encoding, to encode and decode filenames on Mac OS X.
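
As a rough illustration of the idea (not the patch itself, which is C code), 
the encoding selection could be sketched in Python as follows. The helper names 
filename_encoding(), encode_filename() and decode_filename() are hypothetical 
and exist only for this example.

```python
import locale
import sys


def filename_encoding(platform=sys.platform):
    """Return the encoding used for filenames (hypothetical helper)."""
    if platform == "darwin":
        # Mac OS X: always UTF-8, whatever the locale says.
        return "utf-8"
    # Other platforms: keep using the locale encoding.
    return locale.getpreferredencoding(False)


def encode_filename(name, platform=sys.platform):
    """Encode a str filename to bytes, keeping undecodable bytes intact."""
    return name.encode(filename_encoding(platform), "surrogateescape")


def decode_filename(raw, platform=sys.platform):
    """Decode a bytes filename to str, keeping undecodable bytes intact."""
    return raw.decode(filename_encoding(platform), "surrogateescape")
```

With both directions using the same encoding, a filename decoded from the 
command line encodes back to the same bytes, including bytes that are not 
valid UTF-8.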

I don't know yet whether Modules/getpath.c should also decode paths from UTF-8 
instead of the locale encoding on Mac OS X. If so, we may expose 
_Py_decode_filename().

----------
assignee: ronaldoussoren
components: Macintosh
files: macosx.patch
keywords: patch
messages: 174943
nosy: haypo, ronaldoussoren
priority: normal
severity: normal
status: open
title: Mac OS X: don't use the locale encoding but UTF-8 to encode and decode 
filenames
versions: Python 3.4
Added file: http://bugs.python.org/file27903/macosx.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16416>
_______________________________________