STINNER Victor added the comment:

I didn't understand Serhiy's "ls" example. I tried:
$ mkdir unicode
$ cd unicode
$ python3 -c 'open("ab\xe9.txt", "w").close()'
$ python3 -c 'open("euro\u20ac.txt", "w").close()'
$ ls
abé.txt  euro€.txt
$ LANG=C ls
ab??.txt  euro???.txt

Ah yes, I had forgotten that "ls" is aware of the locale encoding.

printf() and wprintf() behave differently on unencodable/undecodable characters:
http://unicodebook.readthedocs.org/en/latest/programming_languages.html#printf-functions-family

Again, the issue is not specific to Python. So it's time to learn how to configure your locales correctly.

About the "interoperability" point I mentioned in my first message ("This encoding is the best choice for interopability with other (python2 or non python) programs."): if you work around the annoying ASCII encoding by forcing the UTF-8 encoding, Python may produce data that is incompatible with other applications that follow POSIX and therefore use the ASCII encoding.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19846>
_______________________________________
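As a footnote to the "ls" session above, here is a minimal sketch of why the C locale shows two "?" for "é" and three for "€". It assumes the file names are stored as UTF-8 and that "ls" prints one "?" per byte it cannot display in the ASCII locale. The helper name c_locale_view is hypothetical and only approximates that behaviour; it is not how "ls" is implemented.

```python
def c_locale_view(name: str) -> str:
    """Approximate what `LANG=C ls` prints for a UTF-8 encoded file name."""
    raw = name.encode("utf-8")  # assumption: file names are stored as UTF-8
    # Assumption: each byte outside printable ASCII is shown as one "?".
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "?" for b in raw)


for name in ("ab\xe9.txt", "euro\u20ac.txt"):
    print(c_locale_view(name))
```

"é" (U+00E9) takes two bytes in UTF-8 and "€" (U+20AC) takes three, which matches the number of question marks in each name.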