[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-08 Thread STINNER Victor
STINNER Victor added the comment: Antoine Pitrou added the comment: Python uses the fact that the filesystem encoding is the locale encoding in various places. The patch doesn't change that. Nick Coghlan added the comment: Note that the *only* change Antoine's patch makes is that: - *if*

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-08 Thread Nick Coghlan
Nick Coghlan added the comment: Yes, that's the point. *Every* case I've seen where the locale encoding has been reported as ASCII on a modern Linux system has been because the environment has been configured to use the C locale, and that locale has a silly, antiquated, encoding setting.

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-08 Thread STINNER Victor
STINNER Victor added the comment: 2013/12/8 Nick Coghlan rep...@bugs.python.org: Yes, that's the point. *Every* case I've seen where the locale encoding has been reported as ASCII on a modern Linux system has been because the environment has been configured to use the C locale, and that

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: If you use a different encoding but only just for filenames, you will get mojibake when you pass a filename on the command line or in an environment variable. That's not what the patch does.

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-08 Thread STINNER Victor
STINNER Victor added the comment: 2013/12/8 Antoine Pitrou rep...@bugs.python.org: Python uses the fact that the filesystem encoding is the locale encoding in various places. The patch doesn't change that. You wrote: - With the patch: utf-8 utf-8 utf-8 ANSI_X3.4-1968, so os.get

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-08 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Setting sys.stderr encoding to UTF-8 on ASCII locale is wrong. sys.stderr has the backslashreplace error handler by default, so it never fails and should never produce non-ASCII data on ASCII locale. -- nosy: +serhiy.storchaka
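
The behaviour Serhiy describes can be sketched with an in-memory stream. This is only an illustration: `raw` and `stream` are hypothetical stand-ins for sys.stderr under an ASCII locale, not the real stream object, and the ASCII encoding is set explicitly rather than taken from the locale.

```python
import io

# Hypothetical stand-in for sys.stderr under an ASCII locale: an ASCII-encoded
# text stream that uses the backslashreplace error handler.
raw = io.BytesIO()
stream = io.TextIOWrapper(
    raw, encoding="ascii", errors="backslashreplace", newline="\n"
)

# Writing non-ASCII text does not raise; unencodable characters are escaped.
stream.write("caf\u00e9 \u20ac\n")
stream.flush()

output = raw.getvalue()
print(output)
```

Every byte written stays in the ASCII range, which is the property the comment relies on.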

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-08 Thread Larry Hastings
Larry Hastings added the comment: Antoine: are you characterizing this as a bug rather than a new feature? I'd like to see more of a consensus before something like this gets checked in. Right now I see a variety of opinions. When I think conservative approach and knows about system encoding

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: "Or said differently, the filesystem encoding is different than the locale encoding." Indeed, but the FS encoding and the IO encoding are the same. The locale encoding doesn't really matter here, as we are assuming that it's wrong.

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-08 Thread Nick Coghlan
Nick Coghlan added the comment: Victor, people set LANG=C for all sorts of reasons, and we have no control over how operating systems define that locale. The user perception is Python 3 doesn't work properly when you ssh into systems, not Gee, I wish operating systems defined the C locale more

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-08 Thread STINNER Victor
STINNER Victor added the comment: haypo: title: Setting LANG=C breaks Python 3 -> print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding() Oh, I didn't want to change the title of the issue, it's a bug in Roundup when I reply by email :-/

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-07 Thread STINNER Victor
STINNER Victor added the comment: If you want to avoid the encoding errors, you can also use PYTHONIOENCODING=:replace or PYTHONIOENCODING=:backslashreplace in Python 3.4 to keep the locale encoding but use an error handler other than strict.
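
A rough way to observe the effect of PYTHONIOENCODING is to start a child interpreter with the variable set. The helper name `run_child` is hypothetical, and the encoding is pinned to ascii here (rather than using the bare `:handler` form from the comment) so that the result does not depend on the locale of the machine running the sketch.

```python
import os
import subprocess
import sys


def run_child(io_encoding):
    """Run a child interpreter that prints a non-ASCII string.

    Returns (exit code, stripped stdout bytes). `io_encoding` is passed
    through the PYTHONIOENCODING environment variable.
    """
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    # The child source itself is pure ASCII; the escape is decoded by the child.
    child_code = "print('caf\\u00e9')"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode, result.stdout.strip()


print(run_child("ascii:backslashreplace"))
print(run_child("ascii:replace"))
```

With the strict handler the same child exits with an error instead of printing, which is the failure the error handlers are meant to avoid.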

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-07 Thread Sworddragon
Sworddragon added the comment: Using an environment variable is not the holy grail for this. On writing a non-single-user application you can't expect the user to set extra environment variables. If compatibility is the only reason in my opinion it would be much better to include something

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-07 Thread Antoine Pitrou
Antoine Pitrou added the comment: "Using an environment variable is not the holy grail for this. On writing a non-single-user application you can't expect the user to set extra environment variables." I am not understanding why the user would have to set anything at all. What is the use case

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-07 Thread Antoine Pitrou
Changes by Antoine Pitrou pit...@free.fr: -- nosy: +ncoghlan

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-07 Thread Nick Coghlan
Nick Coghlan added the comment: Antoine's suggestion of being a little more aggressive in choosing utf-8 over ascii as the OS API encoding sounds reasonable to me. I think we're getting to a point where a system claiming ASCII as the encoding to use is almost certainly a misconfiguration

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-07 Thread Antoine Pitrou
Antoine Pitrou added the comment: Here is a patch.
$ LANG=C ./python -c "import os, sys, locale; print(sys.getfilesystemencoding(), sys.stdin.encoding, os.device_encoding(0), locale.getpreferredencoding())"
- Without the patch: ascii ANSI_X3.4-1968 ANSI_X3.4-1968 ANSI_X3.4-1968
- With the
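
The four values in that one-liner can also be collected by a small script, which is easier to rerun under different LANG settings. This is a sketch; the function name `report_encodings` and the dictionary keys are made up for illustration, and the values depend entirely on the environment it runs in.

```python
import locale
import os
import sys


def report_encodings():
    """Collect the four encodings printed by the one-liner in the comment."""
    return {
        "filesystem": sys.getfilesystemencoding(),
        # sys.stdin can be None in some embedded or detached setups.
        "stdin": getattr(sys.stdin, "encoding", None),
        # None when file descriptor 0 is not connected to a terminal.
        "device(0)": os.device_encoding(0),
        "preferred": locale.getpreferredencoding(),
    }


for name, value in report_encodings().items():
    print(name, value)
```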

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-07 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- nosy: +lemburg, loewis

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-07 Thread STINNER Victor
STINNER Victor added the comment: There was a previous attempt to use a file encoding different from the locale encoding, and it introduced too many issues: https://mail.python.org/pipermail/python-dev/2010-October/104509.html "Inconsistencies if locale and filesystem encodings are different" Python

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-07 Thread Antoine Pitrou
Antoine Pitrou added the comment: Python uses the fact that the filesystem encoding is the locale encoding in various places. The patch doesn't change that.

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-07 Thread Nick Coghlan
Nick Coghlan added the comment: Note that the *only* change Antoine's patch makes is that:
- *if* the locale encoding is ASCII (or an alias for ASCII)
- *then* Python sets the filesystem encoding to UTF-8 instead
If the locale encoding is anything *other* than ASCII, then that will still be
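
The rule Nick summarizes can be written down as a few lines of Python. This is a sketch of the decision only, not the patch itself; the function name `choose_filesystem_encoding` is hypothetical, and alias handling is delegated to `codecs.lookup`, which is an assumption about how one might normalize names such as ANSI_X3.4-1968.

```python
import codecs


def choose_filesystem_encoding(locale_encoding):
    """Return UTF-8 when the locale encoding is ASCII (or an alias of it).

    Any other encoding, including names Python does not recognize,
    is returned unchanged.
    """
    try:
        canonical = codecs.lookup(locale_encoding).name
    except LookupError:
        # Unknown codec name: leave the decision to the caller.
        return locale_encoding
    return "utf-8" if canonical == "ascii" else locale_encoding


print(choose_filesystem_encoding("ANSI_X3.4-1968"))
print(choose_filesystem_encoding("ISO-8859-1"))
```

Only the ASCII case is special-cased, which matches the narrow scope described in the comment.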

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-06 Thread Terry J. Reedy
Terry J. Reedy added the comment: Unless there is an actual possibility of changing this, which I doubt since it is a choice and not a bug, and changing might break things, this issue should be closed. -- nosy: +terry.reedy

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-12-06 Thread Antoine Pitrou
Antoine Pitrou added the comment: I think the ship has sailed on this. We can't change our heuristic every time someone finds a flaw in the current one. In the long term, all sensible UNIX systems should be configured for utf-8 filenames and contents, so it won't make a difference anymore.

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-11-30 Thread Sworddragon
New submission from Sworddragon: It seems that print() and write() (and maybe other such I/O functions) are relying on sys.getfilesystemencoding(). But these functions are not operating with filenames but with their content. In the attachments is an example script which demonstrates this

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-11-30 Thread R. David Murray
R. David Murray added the comment: Victor can correct me if I'm wrong, but I believe that stdin/stdout/stderr all use the filesystem encoding because filenames are the most likely source of non-ascii characters on those streams. (Not a perfect solution, but the best we can do.) --

[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

2013-11-30 Thread STINNER Victor
STINNER Victor added the comment: "Filesystem encoding" is not a good name. You should read "OS encoding" or maybe "locale encoding". This encoding is the best choice for interoperability with other (Python 2 or non-Python) programs. If you don't care about interoperability, force the encoding using