[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: ... So Antoine and Martin: which encoding do you prefer? I still propose to drop the fsname encoding. Then this question goes away. You mean that we should use the following encoding for the command line arguments, environment

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: I like this solution because it doesn't change a lot of things. I agree to drop PYTHONFSENCODING because it looks like PYTHONFSENCODING introduced more inconsistencies than it solved. If you remove the

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: You mean that we should use the following encoding for the command line arguments, environment variables and all filenames/paths: - Mac OS X: utf-8 - Windows: unicode for command line/env, mbcs to decode filenames No: unicode for

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: If you remove both, Python will get very poor grades for OS interoperability on platforms that often deal with multiple different encodings for file names. Why that? It will work very well in such a setting, much better than, say, Java.

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Ronald Oussoren
Ronald Oussoren ronaldousso...@mac.com added the comment: On 09 Oct, 2010, at 02:07 PM, Antoine Pitrou rep...@bugs.python.org wrote: Antoine Pitrou pit...@free.fr added the comment: For the command line, it would mean that we introduced a new encoding: command line encoding, which will be

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: There is one reason for not wanting to assume that the encoding is always UTF-8: the user might access the system from a non-UTF8 terminal (such as when logging in with an SSH session from a system not using UTF-8, or using an alternate

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Martin v. Löwis wrote: Martin v. Löwis mar...@v.loewis.de added the comment: If you remove both, Python will get very poor grades for OS interoperability on platforms that often deal with multiple different encodings for file names.

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: Being pedantic about forcing some encoding onto things that don't have an encoding won't really work out in practice. Dealing with file names, OS environments, pipes and sockets is dirty work, so I think we should go with the 80-20

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: However, I completely fail to see the advantage that the PYTHONFSENCODING variable has over the LANG variable. If it's possible to set PYTHONFSENCODING in some application, it surely is also possible to set LANG (or LC_CTYPE), no? Setting the

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Martin v. Löwis wrote: Martin v. Löwis mar...@v.loewis.de added the comment: Being pedantic about forcing some encoding onto things that don't have an encoding won't really work out in practice. Dealing with file names, OS

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: MvL - Windows: unicode for command line/env, mbcs to decode filenames MvL No: unicode for filenames also. Yes, I mean unicode for everything, but decode bytes data from the mbcs encoding.

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: MAL If you remove the PYTHONFSENCODING, then we have to reconsider MAL removal of sys.setfilesystemencoding(). Please, Marc, read my comments. You never consider technical problems, you just propose to ensure that Python

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: You can't possibly expect a user to switch to using UTF-8 for all his/her applications just because Python needs this to properly decode file names. If the user hasn't switched to UTF-8, why would Python need that to properly decode file

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: MAL You can't just tell people to go with whatever encoding setup MAL you prefer to make Python's guessing easier or more correct. Python doesn't really *guess* the encoding, it just reads the encoding from the locale. What do you
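To illustrate the point that Python reads the encoding from the locale rather than guessing it, here is a minimal sketch using only standard-library calls. The variable names are illustrative, and the values printed depend entirely on the platform and on the LANG/LC_CTYPE settings of the process; on current Python versions the two values usually agree, which was not always the case when this thread was written.

```python
import locale
import sys

# Encoding derived from the user's locale settings (LC_CTYPE / LANG).
# Passing False asks Python not to call setlocale() as a side effect.
locale_encoding = locale.getpreferredencoding(False)

# Encoding Python uses to convert between str and bytes file names.
fs_encoding = sys.getfilesystemencoding()

print("locale encoding:    ", locale_encoding)
print("filesystem encoding:", fs_encoding)
```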

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: I guess LANG and LC_CTYPE can be used for other purposes such as internationalization. That's why there are different environment variables: * LC_MESSAGES for i18n (messages) * LC_CTYPE for the encoding * LC_TIME for time and

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: issue9992.patch: - Remove PYTHONFSENCODING environment variable - Mac OS X: Use utf-8 to decode command line arguments - Fix issue #9992 (this issue): attached test, locale_fs_encoding.py, pass - Fix issue #9988 - Fix issue

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-11 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: I think that issue9992.patch also fixes #4388 because it uses the same encoding (FS encoding, utf8) on OSX to encode and to decode command line arguments.

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-10 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: We run into problems because we have two inconsistent encodings, ... What? No. We have problems because we don't use the same encoding to decode and to encode the same data type. It's not a problem to use a different encoding

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-10 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: On 10.10.2010 17:51, STINNER Victor wrote: STINNER Victor victor.stin...@haypocalc.com added the comment: We run into problems because we have two inconsistent encodings, ... What? No. We have problems because we don't use the

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-10 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: What? No. We have problems because we don't use the same encoding to decode and to encode the same data type. It's not a problem to use a different encoding for each data type (stdout, filenames, environment variables, ...).
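The failure mode described here, encoding and decoding the same data with different encodings, can be sketched in a few lines. The helper `roundtrip` and the sample text are hypothetical and only stand in for what happens when, say, a command line argument is encoded with one encoding and decoded with another; this is not the code from the tracker issue.

```python
def roundtrip(text, encode_with, decode_with):
    """Encode text with one encoding and decode the bytes with another."""
    return text.encode(encode_with).decode(decode_with)

original = "\u00e9"  # 'é'

# Same encoding in both directions: the text survives unchanged.
same = roundtrip(original, "utf-8", "utf-8")

# Mismatched encodings: the two UTF-8 bytes are read as two Latin-1
# characters, so the result is mojibake instead of the original text.
mixed = roundtrip(original, "utf-8", "latin-1")

print(repr(same), repr(mixed))
```

Using a different encoding per data type is harmless as long as each data type uses one encoding consistently in both directions.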

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-10 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: For the command line arguments and environment variables, we don't have a lot of choices: locale or filesystem encodings. So Antoine and Martin: which encoding do you prefer? I still propose to drop the fsname encoding. Then this

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-10 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: On Sunday 10 October 2010 at 18:23 +, Martin v. Löwis wrote: Martin v. Löwis mar...@v.loewis.de added the comment: For the command line arguments and environment variables, we don't have a lot of choices: locale or filesystem

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-10 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: I don't know what you mean by dropping, since OS X by construction needs a filesystem encoding (utf-8) different from the locale encoding; See above. I propose to stop using the locale encoding for command line arguments and environment

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-09 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Perhaps. We could also declare that command line arguments and environment variables are always UTF-8-encoded on OSX (which I think would be fairly accurate) Python uses the filesystem encoding to encode/decode environment
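As a rough sketch of what "using the filesystem encoding to encode/decode" looks like from Python code, the standard-library helpers `os.fsdecode` and `os.fsencode` (available since Python 3.2) apply that encoding in both directions. The byte string below is a made-up example, and the lossless round trip shown assumes a POSIX system, where undecodable bytes are preserved through the `surrogateescape` error handler; Windows behaves differently.

```python
import os

# Hypothetical raw bytes as they might arrive from the OS, e.g. a
# Latin-1 encoded file name or environment value (assumption: POSIX).
raw = b"caf\xe9.txt"

# Decode with the filesystem encoding. On POSIX, bytes that are not
# valid in that encoding are kept as lone surrogates, not rejected.
decoded = os.fsdecode(raw)

# Encoding with the same encoding and error handler restores the bytes.
encoded = os.fsencode(decoded)

print(repr(decoded), encoded == raw)
```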

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-09 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: For the command line, it would mean that we introduced a new encoding: command line encoding, which will be utf-8 on OSX. Or more generally environment encoding, if it's also used for env vars. This could solve the subprocess issue neatly.

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-09 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: So perhaps it would be best if Python had two external default encodings: the IO one (command line arguments, environment variables, text files), and the file name encoding (defaulting to the IO encoding if not set) Hum, I prefer

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-09 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: On 09.10.2010 14:07, Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: For the command line, it would mean that we introduced a new encoding: command line encoding, which will be utf-8 on OSX. Or more

[issue9992] Command line arguments are not correctly decoded if locale and filesystem encodings are different

2010-10-09 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Please no. We run into problems because we have two inconsistent encodings, and now you propose to introduce another one, allowing for even more inconsistencies??? It would not really be a third encoding, since it would replace the locale