STINNER Victor added the comment:

The attached patch works around the CODESET issue on OpenIndiana and FreeBSD.
If the LC_CTYPE locale is "C" and nl_langinfo(CODESET) returns ASCII (or an
alias of this encoding), b"\xE9" is decoded from the locale encoding: if the
result is U+00E9, the patched Python uses ISO-8859-1. (If decoding fails, the
locale encoding really is ASCII and the workaround is not used.)

If decoding succeeds but the result is different (b'\xe9' is not decoded from
the locale encoding to U+00E9), a ValueError is raised. I added this check to
detect bugs, and I hope that our buildbots will validate the code. We may
choose a different behaviour (e.g. keep ASCII).
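
For readers who want the decision logic in one place, here is a rough
pure-Python sketch of the check described above. It is not the patch itself
(the patch works at the C level). The function name workaround_codeset and the
decode_locale callback are hypothetical stand-ins for illustration; the
callback plays the role of "decode bytes from the locale encoding".

```python
import codecs


def workaround_codeset(ctype_locale, codeset, decode_locale):
    """Return the encoding to use, following the rules described above.

    ctype_locale  -- name of the LC_CTYPE locale, e.g. "C"
    codeset       -- what nl_langinfo(CODESET) reported, e.g. "US-ASCII"
    decode_locale -- hypothetical callable decoding bytes with the real
                     locale encoding (an assumption of this sketch)
    """
    # The workaround only applies to the "C" locale.
    if ctype_locale != "C":
        return codeset

    # It only applies if the announced codeset is ASCII or an alias of it.
    try:
        if codecs.lookup(codeset).name != "ascii":
            return codeset
    except LookupError:
        return codeset

    # Probe the real behaviour of the locale with a non-ASCII byte.
    try:
        decoded = decode_locale(b"\xe9")
    except UnicodeDecodeError:
        # Decoding failed: the locale encoding really is ASCII.
        return codeset

    if decoded == "\xe9":
        # The byte round-trips to U+00E9: the locale behaves like Latin-1.
        return "iso8859-1"

    # Anything else is unexpected; fail loudly so the bug is noticed.
    raise ValueError("unexpected decoding of b'\\xe9': %r" % (decoded,))
```

With a Latin-1-like decoder, workaround_codeset("C", "US-ASCII", ...) gives
'iso8859-1', which matches the patched FreeBSD output below; with a strict
ASCII decoder it leaves the codeset unchanged.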

Example on FreeBSD 8.2, original Python 3.4:

$ ./python
>>> import sys, locale
>>> sys.getfilesystemencoding()
'ascii'
>>> locale.getpreferredencoding()
'US-ASCII'

Example on FreeBSD 8.2, patched Python 3.4:

$ ./python
>>> import sys, locale
>>> sys.getfilesystemencoding()
'iso8859-1'
>>> locale.getpreferredencoding()
'iso8859-1'

----------
keywords: +patch
Added file: http://bugs.python.org/file27965/workaround_codeset.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16455>
_______________________________________