Serhiy Storchaka added the comment:
And yet, in Python 2, people could do that, and Python didn't care.
*That's* the regression I'm worried about. If it hadn't round-tripped
cleanly in Python 2, I wouldn't care here either.
$ python2.7 -c "print u'\u20ac'"
€
$ LANG=C python2.7 -c print
Serhiy Storchaka added the comment:
sworddragon@ubuntu:~$ LANG=C
sworddragon@ubuntu:~$ ä
bash: $'\303\244': command not found
- The terminal doesn't pseudo-crash with an exception because it doesn't
care about encodings.
- It allows changing the encoding at runtime.
This is not a
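A note on the bash error quoted above: the `$'\303\244'` in the message is the UTF-8 encoding of "ä" written as octal byte escapes. A minimal sketch of that arithmetic, assuming the terminal sent UTF-8 (the helper name `bash_style_escape` is hypothetical, and unlike bash it escapes every byte, not only the non-ASCII ones):

```python
def bash_style_escape(text: str, encoding: str = "utf-8") -> str:
    """Render each encoded byte as a three-digit octal escape."""
    return "".join(f"\\{b:03o}" for b in text.encode(encoding))


# "ä" (U+00E4) is the two bytes 0xC3 0xA4 in UTF-8, i.e. octal 303 and 244.
print(bash_style_escape("\u00e4"))  # \303\244
```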
Marc-Andre Lemburg added the comment:
The C locale is part of the ANSI C standard. The POSIX locale is an alias
for the C locale and a POSIX standard, so we cannot just replace the ASCII
encoding with UTF-8 as we wish, so Antoine's patch won't work.
See e.g.
STINNER Victor added the comment:
I didn't understand Serhiy's ls example. I tried:
$ mkdir unicode
$ cd unicode
$ python3 -c 'open("ab\xe9.txt", "w").close()'
$ python3 -c 'open("euro\u20ac.txt", "w").close()'
$ ls
abé.txt euro€.txt
$ LANG=C ls
ab??.txt euro???.txt
Ah yes, I didn't remember that
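The counts of question marks in the `LANG=C ls` output above match the UTF-8 byte lengths of the two characters. A rough sketch of that, assuming the names were stored as UTF-8 and that an ASCII-only tool shows one `?` per non-ASCII byte (`ascii_view` is a hypothetical helper, not what `ls` actually runs):

```python
def ascii_view(name: str) -> str:
    """Show a UTF-8 encoded file name as an ASCII-only tool might display it."""
    raw = name.encode("utf-8")
    # One '?' per byte outside the ASCII range.
    return "".join(chr(b) if b < 0x80 else "?" for b in raw)


print(ascii_view("ab\xe9.txt"))      # é is 2 bytes in UTF-8 -> ab??.txt
print(ascii_view("euro\u20ac.txt"))  # € is 3 bytes in UTF-8 -> euro???.txt
```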
STINNER Victor added the comment:
Nick: "testing applications for POSIX compliance"
Sorry, but what do you mean by POSIX compliance? The POSIX standard only
specifies the ASCII encoding.
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html
The tables in Locale Definition describe
STINNER Victor added the comment:
Marc-Andre: "AFAIK, Python 3 does work with ASCII data in the C locale, so I'm
not sure whether this is a bug at all."
What do you mean? Python has used the surrogateescape error handler since
Python 3.1: undecodable bytes are stored as surrogate characters.
Many bugs
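To illustrate the surrogateescape point in the comment above, here is a small sketch of how undecodable bytes round-trip under an ASCII codec. The sample bytes are made up for illustration; only the built-in `surrogateescape` error handler is used:

```python
raw = b"ab\xc3\xa9.txt"  # UTF-8 bytes for "abé.txt", seen through an ASCII locale

# Each non-ASCII byte 0xNN is smuggled through as the lone surrogate U+DCNN.
text = raw.decode("ascii", "surrogateescape")
print(ascii(text))

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode("ascii", "surrogateescape")
print(restored == raw)

# A strict encoder refuses the lone surrogates.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
print(strict_ok)
```

The last step is where the trouble discussed in this thread tends to surface: the string is fine internally, but anything that encodes it strictly (for example writing it to a stream with a strict encoder) raises an exception.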
Marc-Andre Lemburg added the comment:
On 09.12.2013 11:19, STINNER Victor wrote:
STINNER Victor added the comment:
Marc-Andre AFAIK, Python 3 does work with ASCII data in the C locale, so I'm
not sure whether this is a bug at all.
What do you mean? Python uses the surrogateescape
Changes by Nick Coghlan ncogh...@gmail.com:
--
title: print() and write() are relying on sys.getfilesystemencoding() instead
of sys.getdefaultencoding() -> Setting LANG=C breaks Python 3
___
Python tracker rep...@bugs.python.org
Changes by STINNER Victor victor.stin...@gmail.com:
--
title: print() and write() are relying on sys.getfilesystemencoding() instead
of sys.getdefaultencoding() -> Setting LANG=C breaks Python 3
___
Python tracker rep...@bugs.python.org
STINNER Victor added the comment:
Or said differently, the filesystem encoding is different than the
locale encoding.
Indeed, but the FS encoding and the IO encoding are the same.
locale encoding doesn't really matter here, as we are assuming that
it's wrong.
Oh, I realized that FS
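For anyone trying to tell the encodings named in this exchange apart, a quick sketch that prints them side by side. The function name `encoding_report` is hypothetical; the calls are standard library. Results under `LANG=C` depend on the Python version, and releases since 3.7 (PEP 538, PEP 540) may not behave like the 2013 interpreters discussed here:

```python
import locale
import sys


def encoding_report() -> dict:
    """Collect the encodings the interpreter uses in different places."""
    return {
        "default": sys.getdefaultencoding(),            # str <-> bytes default
        "filesystem": sys.getfilesystemencoding(),      # file names, argv, environ
        "locale": locale.getpreferredencoding(False),   # what the locale reports
        "stdout": getattr(sys.stdout, "encoding", None),  # may be None if replaced
    }


for name, value in encoding_report().items():
    print(f"{name:>10}: {value}")
```

Running it once normally and once with `LANG=C` set in the environment shows which of the four values follow the locale.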
Antoine Pitrou added the comment:
On Sun., 2013-12-08 at 22:22 +, STINNER Victor wrote:
(b) for technical reasons, Python reuses the C codec during Python
initialization to decode and encode OS data, and so currently Python
*must* use the locale encoding for its filesystem encoding
Ahhh!
STINNER Victor added the comment:
It seems there is more work to do to get this right, but I'm not
terribly interested either. Feel free to take over.
If you are talking to me: I'm currently opposed to changing anything, so I'm not
interested in working on a patch. IMO Python works fine and you
Nick Coghlan added the comment:
End users tripping over this by setting LANG=C is one of the pain points of
Python 3 relative to Python 2 for Fedora, so I've added a couple of Fedora
folks to the nosy list.
My current understanding of the situation:
- we should leave Windows and Mac OS X
STINNER Victor added the comment:
End users tripping over this by setting LANG=C is one of the pain points of
Python 3 relative to Python 2 for Fedora, so I've added a couple of Fedora
folks to the nosy list.
Sorry, I'm not aware of such an issue. Do you have examples?
- the main problem is
Nick Coghlan added the comment:
On 9 December 2013 12:08, STINNER Victor rep...@bugs.python.org wrote:
STINNER Victor added the comment:
End users tripping over this by setting LANG=C is one of the pain points of
Python 3 relative to Python 2 for Fedora, so I've added a couple of Fedora
Sworddragon added the comment:
You should keep things simpler:
- Python and the operating system/filesystem are in a client-server
relationship and Python should validate everything.
- It doesn't matter what you will finally decide to be the default encoding on
various places - all will provide