[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Roundup Robot devnull@devnull added the comment:

New changeset 932de36903e7 by Ronald Oussoren in branch '2.7':
(backport) Fix #10154 and #10090: locale normalizes the UTF-8 encoding to UTF-8 instead of UTF8
http://hg.python.org/cpython/rev/932de36903e7

New changeset 28e410eb86af by Ronald Oussoren in branch '3.1':
Fix #10154 and #10090: locale normalizes the UTF-8 encoding to UTF-8 instead of UTF8
http://hg.python.org/cpython/rev/28e410eb86af

New changeset 454d13e535ff by Ronald Oussoren in branch '3.2':
(merge) Fix #10154 and #10090: locale normalizes the UTF-8 encoding to UTF-8 instead of UTF8
http://hg.python.org/cpython/rev/454d13e535ff

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10154
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Changes by Ronald Oussoren ronaldousso...@mac.com:

--
resolution: -> fixed
stage: needs patch -> committed/rejected
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Changes by Ronald Oussoren ronaldousso...@mac.com:

--
status: open -> closed
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Roundup Robot devnull@devnull added the comment:

New changeset 3d7cb852a176 by Ronald Oussoren in branch 'default':
Fix for issue 10154, merge from 3.2
http://hg.python.org/cpython/rev/3d7cb852a176
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Ronald Oussoren ronaldousso...@mac.com added the comment:

The attached patch implements the change that Marc-Andre proposed.

I intend to apply this patch to all active branches later today (after some more testing).

--
keywords: +patch
Added file: http://bugs.python.org/file21916/issue10154.patch
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Changes by STINNER Victor victor.stin...@haypocalc.com:

--
nosy: +haypo
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Marc-Andre Lemburg m...@egenix.com added the comment:

Piotr Sikora wrote:
>
> Piotr Sikora piotr.sik...@frickle.com added the comment:
>
> It's the same on OpenBSD (and I'm pretty sure it's true for other BSDs as well).
>
> >>> locale.resetlocale()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/local/lib/python2.6/locale.py", line 523, in resetlocale
>     _setlocale(category, _build_localename(getdefaultlocale()))
> locale.Error: unsupported locale setting
> >>> locale._build_localename(locale.getdefaultlocale())
> 'en_US.UTF8'
>
> Works fine with Marc-Andre's alias table fix.
>
> Any chances this will be eventually fixed in 2.x?

This can go into Python 2.7, and, of course, into the 3.x branches.

--
title: locale.normalize strips - from UTF-8, which fails on Mac -> locale.normalize strips - from UTF-8,which fails on Mac
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Changes by Éric Araujo mer...@netwok.org:

--
stage: -> needs patch
title: locale.normalize strips - from UTF-8,which fails on Mac -> locale.normalize strips - from UTF-8, which fails on Mac
versions: +Python 3.3 -Python 2.5, Python 2.6
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Piotr Sikora piotr.sik...@frickle.com added the comment:

It's the same on OpenBSD (and I'm pretty sure it's true for other BSDs as well).

>>> locale.resetlocale()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.6/locale.py", line 523, in resetlocale
    _setlocale(category, _build_localename(getdefaultlocale()))
locale.Error: unsupported locale setting
>>> locale._build_localename(locale.getdefaultlocale())
'en_US.UTF8'

Works fine with Marc-Andre's alias table fix.

Any chances this will be eventually fixed in 2.x?

--
nosy: +PiotrSikora
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Boris FELD lothiral...@gmail.com added the comment:

Bug confirmed on python2.5+ and python3.2-.

If it works with the dash, I agree with Marc-Andre's solution.

--
nosy: +Boris.FELD
versions: +Python 2.5, Python 2.6
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
MunSic JEONG rus...@gmail.com added the comment:

Ubuntu 10.04.1 LTS also works fine with both "UTF8" and "UTF-8".
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Ronald Oussoren ronaldousso...@mac.com added the comment:

UTF-8 works on SuSE Enterprise Linux 9 and 10 as well.

BTW, neither UTF8 nor UTF-8 works on HPUX 10. That platform requires spelling it as utf8. Sadly enough, this means that the following code doesn't work on HPUX 10:

>>> locale.setlocale(locale.LC_ALL, locale.getdefaultlocale())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/python2.7/lib/python2.7/locale.py", line 531, in setlocale
    return _setlocale(category, locale)
locale.Error: unsupported locale setting

That's because getdefaultlocale returns 'UTF8' as the encoding, even though LANG is set to 'nl_NL.utf8' (which is a working locale on the machine I tested).

BTW, I'm +1 on changing the alias table as Marc-Andre proposed.
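
Editorial sketch, not part of the thread or of locale.py: since the reports above show platforms disagreeing on the spelling (UTF-8 on the Mac, utf8 on HPUX 10, either on glibc), one workaround in application code is to try several spellings in turn. The helper names encoding_spellings and set_first_supported are hypothetical, and the list of variants is an assumption drawn only from the spellings mentioned in this thread.

```python
import locale


def encoding_spellings(encoding):
    """Return candidate spellings for a UTF-8 style encoding name.

    Names that are not some form of "utf8" are returned unchanged.
    The spelling that was passed in is always tried first.
    """
    compact = encoding.replace("-", "").replace("_", "").lower()
    if compact != "utf8":
        return [encoding]
    spellings = [encoding]
    for variant in ("UTF-8", "UTF8", "utf8", "utf-8"):
        if variant not in spellings:
            spellings.append(variant)
    return spellings


def set_first_supported(category, langname, encoding):
    """Call setlocale() with each spelling until one is accepted.

    Returns the string accepted by setlocale(), or None if the platform
    rejects every spelling. Note that this changes the process locale.
    """
    for spelling in encoding_spellings(encoding):
        try:
            return locale.setlocale(category, langname + "." + spelling)
        except locale.Error:
            continue
    return None
```

With the values from the HPUX report, set_first_supported(locale.LC_ALL, 'nl_NL', 'UTF8') would try 'nl_NL.UTF8', 'nl_NL.UTF-8' and then 'nl_NL.utf8'. Whether any of them succeeds depends on the locales installed on the machine.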
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Antoine Pitrou pit...@free.fr added the comment:

Mandriva and Debian also work fine with both "UTF8" and "UTF-8".

For the record, the canonical spelling inside /usr/share/locale is "UTF-8". I suppose glibc does its own normalization.

--
nosy: +pitrou
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Changes by MunSic JEONG rus...@gmail.com:

--
nosy: +ruseel
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Stephen Hansen me+pyt...@ixokai.io added the comment:

Mark, the locals() right before "if encoding:" (line 399) are:

>>> locale.normalize("en_US.UTF-8")
{'code': 'en_US.ISO8859-1', 'langname': 'en_US', 'encoding': 'UTF8', 'norm_encoding': 'utf_8', 'defenc': 'ISO8859-1', 'localename': 'en_US.UTF-8', 'lookup_name': 'en_us.utf-8', 'fullname': 'en_us.utf-8'}
'en_US.UTF8'
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Marc-Andre Lemburg m...@egenix.com added the comment:

Stephen Hansen wrote:
>
> Stephen Hansen me+pyt...@ixokai.io added the comment:
>
> Mark, the locals() right before "if encoding:" (line 399) are:
>
> >>> locale.normalize("en_US.UTF-8")
> {'code': 'en_US.ISO8859-1', 'langname': 'en_US', 'encoding': 'UTF8', 'norm_encoding': 'utf_8', 'defenc': 'ISO8859-1', 'localename': 'en_US.UTF-8', 'lookup_name': 'en_us.utf-8', 'fullname': 'en_us.utf-8'}
> 'en_US.UTF8'

Thanks. Line 646 in the alias table is wrong:

    'utf_8':'UTF8',

should read:

    'utf_8':'UTF-8',

I wonder why this wasn't reported earlier - did the GlibC change the UTF-8 spelling at some point ?

I do vaguely remember that I had to remove the hyphen due to problems with setlocale() not accepting 'UTF-8', but that was at the time I wrote that part of locale.py, i.e. many years ago. It doesn't appear to be necessary anymore.

I checked on openSUSE 10.3 and 11.3. Both work fine with 'UTF-8' and 'UTF8'.
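
Editorial sketch of the alias lookup being discussed. This is a simplified model and not the code in Lib/locale.py: the table ENCODING_ALIASES and the functions normalize_key, aliased_encoding and build_name are hypothetical stand-ins, and the table holds only the one entry quoted above, in its corrected form.

```python
# Hypothetical, trimmed-down stand-in for the encoding alias table in
# Lib/locale.py; the real table has many more entries.
ENCODING_ALIASES = {
    "utf_8": "UTF-8",  # the entry under discussion; it used to map to "UTF8"
}


def normalize_key(encoding):
    """Lower-case the name and use underscores, e.g. 'UTF-8' -> 'utf_8'."""
    return encoding.lower().replace("-", "_")


def aliased_encoding(encoding):
    """Return the spelling from the alias table, or the input unchanged."""
    return ENCODING_ALIASES.get(normalize_key(encoding), encoding)


def build_name(langname, encoding):
    """Join language and encoding; an empty encoding yields the bare language."""
    encoding = aliased_encoding(encoding)
    return langname + "." + encoding if encoding else langname
```

In this model, build_name('en_US', 'utf-8') gives 'en_US.UTF-8'. With the old table value 'UTF8' the same call would give 'en_US.UTF8', which is the string the Mac rejects.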
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Georg Brandl ge...@python.org added the comment:

If other Posix-y systems accept both spellings and only Macs insist on the dash, we should probably indeed change the alias entry to use it.

--
nosy: +georg.brandl
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
New submission from Stephen Hansen me+pyt...@ixokai.io:

In the course of investigating issue10092, Georg discovered that the behavior of locale.normalize() on Mac is bad.

Basically, en_US.UTF-8 is how the correct locale string should be spelled on the Mac. If you drop the dash, it fails, and dropping the dash is exactly what locale.normalize does. So you can't pass the return value of the function to setlocale, even though that's what it's documented to be for.

If that isn't clear, this should demonstrate (from /branches/py3k):

Top-2:build pythonbuildbot$ ./python.exe
Python 3.2a3+ (py3k:85631, Oct 17 2010, 06:45:22)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
[51767 refs]
>>> locale.normalize("en_US.UTF-8")
'en_US.UTF8'
[51770 refs]
>>> locale.setlocale(locale.LC_TIME, 'en_US.UTF8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/pythonbuildbot/test/build/Lib/locale.py", line 538, in setlocale
    return _setlocale(category, locale)
locale.Error: unsupported locale setting
[51816 refs]
>>> locale.setlocale(locale.LC_TIME, 'en_US.UTF-8')
'en_US.UTF-8'
[51816 refs]

The precise same behavior exists on my stock/system Python 2.6, too, fwiw. (Not that it can be fixed on 2.6, but maybe 2.7?)

--
assignee: ronaldoussoren
components: Library (Lib), Macintosh
messages: 119213
nosy: ixokai, ronaldoussoren
priority: normal
severity: normal
status: open
title: locale.normalize strips - from UTF-8, which fails on Mac
type: behavior
versions: Python 2.7, Python 3.1, Python 3.2
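
Editorial sketch, not from the thread: on an interpreter that still shows the behavior reported above, application code could guard against it by trying the normalized name first and falling back to the name as written. The helper set_locale_with_fallback is hypothetical; it uses only locale.normalize, locale.setlocale and locale.Error, which all appear in the session above.

```python
import locale


def set_locale_with_fallback(category, name):
    """Try the normalized name first, then the name exactly as given.

    Returns the locale string accepted by setlocale(), or None if the
    platform supports neither spelling. Note that a successful call
    changes the process locale.
    """
    candidates = []
    for candidate in (locale.normalize(name), name):
        if candidate not in candidates:
            candidates.append(candidate)
    for candidate in candidates:
        try:
            return locale.setlocale(category, candidate)
        except locale.Error:
            continue
    return None
```

On the Mac session shown above, set_locale_with_fallback(locale.LC_TIME, 'en_US.UTF-8') would first attempt 'en_US.UTF8', get locale.Error, and then succeed with 'en_US.UTF-8'.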
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Ronald Oussoren ronaldousso...@mac.com added the comment:

This patch solves the immediate failure:

Index: Lib/locale.py
===
--- Lib/locale.py (revision 85743)
+++ Lib/locale.py (working copy)
@@ -396,6 +396,9 @@
         else:
             encoding = defenc
         #print 'found encoding %r' % encoding
+        if sys.platform == 'darwin' and encoding == 'UTF8':
+            encoding = 'UTF-8'
+
         if encoding:
             return langname + '.' + encoding
         else:

I'm not happy about hardcoding this specific exception though; there should be a better solution than this.

Ronald

--
Added file: http://bugs.python.org/file19300/smime.p7s
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Changes by Ronald Oussoren ronaldousso...@mac.com:

Removed file: http://bugs.python.org/file19300/smime.p7s
[issue10154] locale.normalize strips - from UTF-8, which fails on Mac
Marc-Andre Lemburg m...@egenix.com added the comment:

Ronald Oussoren wrote:
>
> Ronald Oussoren ronaldousso...@mac.com added the comment:
>
> This patch solves the immediate failure:
>
> Index: Lib/locale.py
> ===
> --- Lib/locale.py (revision 85743)
> +++ Lib/locale.py (working copy)
> @@ -396,6 +396,9 @@
>          else:
>              encoding = defenc
>          #print 'found encoding %r' % encoding
> +        if sys.platform == 'darwin' and encoding == 'UTF8':
> +            encoding = 'UTF-8'
> +
>          if encoding:
>              return langname + '.' + encoding
>          else:
>
> I'm not happy about hardcoding this specific exception though, there should be a better solution than this.

Could you tell me the values of localename, code, langname and encoding at that step in the process ?

We may need to add a locale_encoding_alias from 'UTF8' to 'UTF-8', since the version with the hyphen is what the C lib uses.

--
nosy: +lemburg
title: locale.normalize strips - from UTF-8, which fails on Mac -> locale.normalize strips - from UTF-8,which fails on Mac