STINNER Victor added the comment:

I didn't understand Serhiy's "ls" example. I tried:

$ mkdir unicode
$ cd unicode
$ python3 -c 'open("ab\xe9.txt", "w").close()'
$ python3 -c 'open("euro\u20ac.txt", "w").close()'
$ ls
abé.txt  euro€.txt
$ LANG=C ls
ab??.txt  euro???.txt


Ah yes, I had forgotten that "ls" is aware of the locale encoding.
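
The number of "?" characters matches the number of bytes in each character's UTF-8 encoding (2 for U+00E9, 3 for U+20AC). Here is a minimal sketch of that behaviour, assuming the file names are stored on disk as UTF-8. The function display_in_c_locale() is a hypothetical helper written for illustration; it is not how "ls" is actually implemented.

```python
def display_in_c_locale(name: str, fs_encoding: str = "utf-8") -> str:
    """Mimic how an ASCII-only listing might show a file name (illustration only)."""
    # Bytes as stored on disk; UTF-8 is an assumption about the file system.
    raw = name.encode(fs_encoding)
    # Printable ASCII bytes pass through; every other byte becomes "?".
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "?" for b in raw)


print(display_in_c_locale("ab\xe9.txt"))      # ab??.txt
print(display_in_c_locale("euro\u20ac.txt"))  # euro???.txt
```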

printf() and wprintf() behave differently on unencodable/undecodable characters:
http://unicodebook.readthedocs.org/en/latest/programming_languages.html#printf-functions-family

Again, the issue is not specific to Python. So it's time to learn how to 
configure your locales correctly.
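
To check what a given locale configuration gives you, something like the following may help. It is only a sketch: describe_encodings() is a hypothetical helper name, and the values printed depend entirely on your platform, Python version and environment variables (LANG, LC_ALL, LC_CTYPE).

```python
import locale
import sys


def describe_encodings() -> dict:
    """Collect the encodings Python selected for this process."""
    return {
        # Encoding derived from the locale, without calling setlocale().
        "locale": locale.getpreferredencoding(False),
        # Encoding used to convert file names between str and bytes.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding of standard output; may be None if stdout was replaced.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


for name, value in describe_encodings().items():
    print(f"{name}: {value}")
```

Running it once normally and once with LANG=C set in the environment should show whether the locale is the source of the problem.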

About the "interoperability" point I mentioned in my first message ("This 
encoding is the best choice for interoperability with other (python2 or non 
python) programs."): if you work around the annoying ASCII encoding by forcing 
the UTF-8 encoding, Python may produce data that is incompatible with other 
applications that follow POSIX and therefore use the ASCII encoding.
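
A small sketch of that incompatibility, assuming the other application decodes its input strictly as ASCII. readable_as_ascii() is a hypothetical helper that stands in for such a consumer.

```python
def readable_as_ascii(data: bytes) -> bool:
    """Return True if a strict ASCII consumer could decode these bytes."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


# Data produced by forcing UTF-8 regardless of the locale encoding.
forced_utf8 = "euro\u20ac.txt".encode("utf-8")

print(readable_as_ascii(b"euro.txt"))  # True
print(readable_as_ascii(forced_utf8))  # False
```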

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19846>
_______________________________________