[issue5815] locale.getdefaultlocale() missing corner case
Roundup Robot added the comment: New changeset 3d805bee06e2 by Serhiy Storchaka in branch '2.7': Issue #5815: Fixed support for locales with modifiers. Fixed support for http://hg.python.org/cpython/rev/3d805bee06e2 New changeset 28883e89f335 by Serhiy Storchaka in branch '3.3': Issue #5815: Fixed support for locales with modifiers. Fixed support for http://hg.python.org/cpython/rev/28883e89f335 New changeset b50971bccfc3 by Serhiy Storchaka in branch 'default': Issue #5815: Fixed support for locales with modifiers. Fixed support for http://hg.python.org/cpython/rev/b50971bccfc3 -- nosy: +python-dev ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5815 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka added the comment: Committed without the devanagari special case and tests.
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka added the comment: For the devanagari modifier I opened a new issue, issue20027.
-- resolution: -> fixed; stage: patch review -> committed/rejected; status: open -> closed
[issue5815] locale.getdefaultlocale() missing corner case
STINNER Victor added the comment: Buildbot failure:
http://buildbot.python.org/all/builders/x86%20Gentoo%20Non-Debug%203.3/builds/1314/steps/test/logs/stdio

    ERROR: test_locale_alias (test.test_locale.NormalizeTest)
    Traceback (most recent call last):
      File "/var/lib/buildslave/3.3.murray-gentoo-wide/build/Lib/test/test_locale.py", line 374, in test_locale_alias
        with self.subTest(locale=(localename, alias)):
    AttributeError: 'NormalizeTest' object has no attribute 'subTest'

-- resolution: fixed -> ; status: closed -> open
[issue5815] locale.getdefaultlocale() missing corner case
Roundup Robot added the comment:

New changeset e0675408f4af by Serhiy Storchaka in branch '2.7':
Don't use sebTest() in tests for issue #5815.
http://hg.python.org/cpython/rev/e0675408f4af

New changeset ed16f6695638 by Serhiy Storchaka in branch '3.3':
Don't use sebTest() in tests for issue #5815.
http://hg.python.org/cpython/rev/ed16f6695638
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka added the comment: Oh, thanks Victor.
-- resolution: -> fixed; status: open -> closed
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka added the comment: Marc-Andre, do you have comments or objections?
[issue5815] locale.getdefaultlocale() missing corner case
Marc-Andre Lemburg added the comment:

On 18.12.2013 22:57, Serhiy Storchaka wrote:
> Marc-Andre, do you have comments or objections?

Your last patch looks fine, but I don't have time to test it.

Regarding the broken *devanagari* entries in the alias table: I think we should remove or correct those. The purpose of normalize() is to return a valid libc locale identifier, and if the values in the alias table are clearly wrong and don't work with libc, there's little point in keeping them, even if the X11 file still lists them with the wrong notation.

If we can fix them so that they do work with libc, let's do that. If we can't, let's remove them. In both cases, please add a comment mentioning the case and why things were changed/removed.

Hope that helps. Thanks.
[issue5815] locale.getdefaultlocale() missing corner case
Mike FABIAN added the comment:

Serhiy> While normalize can return sd...@devanagari.utf-8, _parse_localename()
Serhiy> should be able to parse it correctly.

But if normalize returns sd...@devanagari.utf-8, isn’t that quite useless, because it is a locale name which does not actually work in glibc?

Serhiy> Removing sd...@devanagari.utf-8 from alias table is another issue.

Yes. I think it should be fixed in the alias table as well.
[issue5815] locale.getdefaultlocale() missing corner case
Marc-Andre Lemburg added the comment:

>> Then I don't understand changes such as:
>>
>>     -'chinese-s':'zh_CN.eucCN',
>>     +'chinese-s':'zh_CN.gb2312',
>>
>> or
>>
>>     -'sp': 'sr_CS.ISO8859-5',
>>     -'sp_yu':'sr_CS.ISO8859-5',
>>     +'sp': 'sr_RS.ISO8859-5',
>>     +'sp_yu':'sr_RS.ISO8859-5',
>>
>> The .test_locale_alias() checks that the normalize() function returns the alias given in the alias table.

As mentioned earlier, the purpose of the alias table is to map *normalized* locale names to the C runtime string, which in some cases uses different encoding names than we use in Python.

> It also tests normalize(locale_alias[localname]) == locale_alias[localname] == normalize(localname). I.e. that applying normalize() twice doesn't change a result.

That's not intended. The normalize() function is supposed to prepare the locale for the lookup. It's not supposed to be applied to the looked up value.

About the devanagari special case: this has been in the X11 file for ages and still is ...
http://cgit.freedesktop.org/xorg/lib/libX11/tree/nls/locale.alias.pre
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka added the comment:

> That's not intended. The normalize() function is supposed to prepare the locale for the lookup. It's not supposed to be applied to the looked up value.

The last patch doesn't contain this part of the tests.
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka added the comment:

> There are no such systems really, in X.org this is just a mistake. glibc doesn’t write it like this and it is against the specification here:

While normalize can return sd...@devanagari.utf-8, _parse_localename() should be able to parse it correctly. Removing sd...@devanagari.utf-8 from the alias table is another issue.
[issue5815] locale.getdefaultlocale() missing corner case
Marc-Andre Lemburg added the comment:

On 11.11.2013 20:21, Serhiy Storchaka wrote:
>> That's not intended. The normalize() function is supposed to prepare the locale for the lookup. It's not supposed to be applied to the looked up value.
>
> Last patch doesn't contain this part of tests.

Thanks.
[issue5815] locale.getdefaultlocale() missing corner case
Mike FABIAN added the comment:

Serhiy, in your patch you seem to have special treatment for the devanagari modifier:

    +# Devanagari modifier placed before encoding.
    +return code, modifier.split('.')[1]

Probably because of

    'ks_in@devanagari': 'ks...@devanagari.utf-8',
    'sd': 'sd...@devanagari.utf-8',

in the locale_alias dictionary. But I think these two lines are just wrong; this mistake is inherited from the locale.alias file from X.org, where the Python locale_alias comes from.

glibc:

    mfabian@ari:~ $ locale -a | grep ^sd
    sd_IN
    sd_IN.utf8
    sd_IN.utf8@devanagari
    sd_IN@devanagari
    mfabian@ari:~ $ locale -a | grep ^ks
    ks_IN
    ks_IN.utf8
    ks_IN.utf8@devanagari
    ks_IN@devanagari
    mfabian@ari:~ $

The encoding should always be *before* the modifier.
-- nosy: +mfabian
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka added the comment: The /usr/share/X11/locale/locale.alias file in Ubuntu 12.04 LTS contains ks...@devanagari.utf-8 and sd...@devanagari.utf-8 entries. While the encoding is expected to come before the modifier, if there are systems with ks...@devanagari.utf-8 or sd...@devanagari.utf-8 locales, we should support these weird cases.
[issue5815] locale.getdefaultlocale() missing corner case
Mike FABIAN added the comment:

Serhiy> The /usr/share/X11/locale/locale.alias file in Ubuntu 12.04 LTS
Serhiy> contains ks...@devanagari.utf-8 and sd...@devanagari.utf-8
Serhiy> entities.

Yes, I know, that’s why I wrote that the Python code inherited this mistake from X.org.

Serhiy> While the encoding is expected to be before the modifier, if
Serhiy> there are systems with ks...@devanagari.utf-8 or
Serhiy> sd...@devanagari.utf-8 locales we should support these weird case.

There are no such systems really; in X.org this is just a mistake. glibc doesn’t write it like this and it is against the specification here:

http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html#tag_002

    [language[_territory][.codeset][@modifier]]
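Editorial sketch, not part of the thread: the field order quoted from the specification, [language[_territory][.codeset][@modifier]], can be written as a regular expression. The name split_locale_name and the character classes are assumptions made for illustration; this is not the parser from any of the patches discussed here.

```python
import re

# language[_territory][.codeset][@modifier]
# The character classes are an assumption for this sketch; the
# specification itself does not restrict the fields this tightly.
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<codeset>[A-Za-z0-9_-]+))?"
    r"(?:@(?P<modifier>[A-Za-z0-9_-]+))?$"
)


def split_locale_name(name):
    """Return a dict with the four fields, or None if the order is wrong."""
    match = _LOCALE_RE.match(name)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Codeset before modifier: accepted.
    print(split_locale_name("sd_IN.UTF-8@devanagari"))
    # Modifier before codeset (hypothetical name): rejected.
    print(split_locale_name("xx_YY@mod.UTF-8"))
```

A name with the modifier placed before the codeset does not match at all, which is the ordering problem Mike describes.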
[issue5815] locale.getdefaultlocale() missing corner case
Mike FABIAN added the comment:

In glibc, sd...@devanagari.utf-8 is an invalid locale name, only sd_IN.UTF-8@devanagari is valid:

    mfabian@ari:~ $ LC_ALL=sd_IN.UTF-8@devanagari locale charmap
    UTF-8
    mfabian@ari:~ $ LC_ALL=sd...@devanagari.utf-8 locale charmap
    locale: Cannot set LC_CTYPE to default locale: No such file or directory
    locale: Cannot set LC_MESSAGES to default locale: No such file or directory
    locale: Cannot set LC_ALL to default locale: No such file or directory
    ANSI_X3.4-1968
    mfabian@ari:~ $

So I think this should be fixed in X.org.
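Editorial sketch, not part of the thread: one way to turn a name with the modifier in the wrong place into the order glibc accepts. The helper move_modifier_last and the sample name xx_YY@mod.UTF-8 are hypothetical; the thread itself leaned towards correcting the alias table rather than rewriting names at run time.

```python
def move_modifier_last(name):
    """Rewrite 'lang@modifier.codeset' as 'lang.codeset@modifier'.

    Names without a modifier, or with the codeset already in front of
    the modifier, are returned unchanged.
    """
    head, at, tail = name.partition("@")
    if not at or "." not in tail:
        return name
    modifier, _, codeset = tail.partition(".")
    return "{0}.{1}@{2}".format(head, codeset, modifier)


if __name__ == "__main__":
    print(move_modifier_last("xx_YY@mod.UTF-8"))         # hypothetical input
    print(move_modifier_last("sd_IN.UTF-8@devanagari"))  # already in POSIX order
```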
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka added the comment: Ping. There are two duplicate issues opened last month.
[issue5815] locale.getdefaultlocale() missing corner case
Changes by STINNER Victor victor.stin...@gmail.com: -- nosy: +haypo
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka added the comment: Patch updated. Added tests. The locale_alias mapping was updated to be self-consistent (i.e. for every name in locale_alias.values(), normalize(name) == name).
-- assignee: docs@python -> serhiy.storchaka; keywords: -easy; nosy: +lemburg; stage: needs patch -> patch review; versions: -Python 3.2
Added file: http://bugs.python.org/file31740/locale_parse_2.patch
[issue5815] locale.getdefaultlocale() missing corner case
R. David Murray added the comment: It would be great if this could get a review by MAL, since it looks like a non-trivial change. Also, you have some (commented out) debug prints in there.
[issue5815] locale.getdefaultlocale() missing corner case
Marc-Andre Lemburg added the comment:

On 13.09.2013 15:30, Serhiy Storchaka wrote:
> Patch updated. Added tests. The locale_alias mapping updated to be self-consistency (i.e. for every name in locale_alias.values() normalize(name) == name).

Could you elaborate on the alias changes? Were those coming from an updated X11 locale.alias file?

If so, I'd suggest creating two patches: one with the alias updates (which can then also be backported) and one with the new normalization code (which is a new feature and probably cannot be backported).

Thanks,
-- Marc-Andre Lemburg eGenix.com
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka added the comment:

> Also, you have some (commented out) debug prints in there.

These debug prints were in the old code.
[issue5815] locale.getdefaultlocale() missing corner case
R. David Murray added the comment: Ah, I see. I only scanned the patch quickly, obviously.
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka added the comment:

> Could you elaborate on the alias changes ? Were those coming from an updated X11 local.alias file ?

No, they are not from the X11 locale.alias file. They are a result of the test_locale_alias self-test; I have fixed all failures.

This test can't be backported without the rest of the changes, because they fix other errors, for example the processing of encodings with a hyphen. Without them test_locale_alias will fail even with an updated locale_alias. I.e. we can backport either the changes to locale_alias without tests, or the changes to locale_alias with all changes to the parser and tests, or the changes to the parser and all tests except test_locale_alias.

The current code doesn't work with locales with modifiers and locales with hyphenated encodings.
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka added the comment: Here is a patch without changes to locale_alias.
[issue5815] locale.getdefaultlocale() missing corner case
Changes by Serhiy Storchaka storch...@gmail.com: Added file: http://bugs.python.org/file31742/locale_parse_2a.patch
[issue5815] locale.getdefaultlocale() missing corner case
Marc-Andre Lemburg added the comment:

On 13.09.2013 16:34, Serhiy Storchaka wrote:
>> Could you elaborate on the alias changes ? Were those coming from an updated X11 local.alias file ?
>
> No, they are not from X11 local.alias file. They are a result of the test_locale_alias self-test, I have fixed all failures.
>
> This test can't be backported without rest of changes, because they fix other error, for example processing encodings with hyphen. Without them test_locale_alias will fail even with updated locale_alias. I.e. we can backport either changes to locale_alias without tests, or changes to locale_alias with all changes to parser and tests, or changes to parser and all tests except test_locale_alias.
>
> Current code doesn't work with locales with modifiers and locales with hyphenated encodings.

Then I don't understand changes such as:

    -'chinese-s':'zh_CN.eucCN',
    +'chinese-s':'zh_CN.gb2312',

or

    -'sp': 'sr_CS.ISO8859-5',
    -'sp_yu':'sr_CS.ISO8859-5',
    +'sp': 'sr_RS.ISO8859-5',
    +'sp_yu':'sr_RS.ISO8859-5',

The .test_locale_alias() checks that the normalize() function returns the alias given in the alias table. If you have to make changes to the alias table that cause the encoding or the locale to change, something is wrong with the normalize() function.

Note that we are using the X11 locale.alias file as the basis for the mapping, so any such changes need to be found there as well. The Tools/i18n/makelocalealias.py script can be used to create an updated listing.

Please remember that the output of the alias table is a C runtime locale string. Those do not necessarily use the same encodings as we do in Python.

Perhaps we should open a separate ticket for the update of the alias table.

I just ran the script on my older dev system and it returned this list of changes compared to what's in Python 2.7:

    #added 'ar_in'
    #added 'as_in'
    #added 'be_bg'
    #added 'bo_in'
    #added 'en_dk'
    #added 'hne_in'
    #added 'ks_in'
    #added 'mai_in'
    #added 'ml_in'
    #added 'ne_np'
    #added 'or_in'
    #added 'pa_pk'
    #added 'sd_in'
    #added 'sd_in@devanagari'
    #added 'te_in'
    #updated 'univ' -> 'en_US.utf' to 'en_US.UTF-8'
    #added 'ur_in'
    #added 'zh_sg'
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka added the comment:

> Then I don't understand changes such as:
>
>     -'chinese-s':'zh_CN.eucCN',
>     +'chinese-s':'zh_CN.gb2312',
>
> or
>
>     -'sp': 'sr_CS.ISO8859-5',
>     -'sp_yu':'sr_CS.ISO8859-5',
>     +'sp': 'sr_RS.ISO8859-5',
>     +'sp_yu':'sr_RS.ISO8859-5',
>
> The .test_locale_alias() checks that the normalize() function returns the alias given in the alias table.

It also tests normalize(locale_alias[localname]) == locale_alias[localname] == normalize(localname), i.e. that applying normalize() twice doesn't change the result.

chinese-s is mapped to zh_CN.eucCN, but eucCN is mapped to gb2312. sp is mapped to sr_CS.ISO8859-5, but sr_CS is mapped to sr_RS.UTF-8, and then .ISO8859-5 replaces UTF-8.

Of course we could call normalize() recursively, but it is more practical to just update the mapping.
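Editorial sketch, not part of the thread: the self-consistency property Serhiy describes (normalize(value) == value for every value in the alias table) can be checked against whatever table the running interpreter ships. The function name non_idempotent_aliases is an assumption; locale.locale_alias and locale.normalize() are the objects named in the discussion. Whether the property ought to hold was disputed in the thread, so a non-empty result is an observation, not necessarily a bug.

```python
import locale


def non_idempotent_aliases():
    """Return (key, value, renormalized) for alias values that normalize() changes."""
    problems = []
    for key, value in sorted(locale.locale_alias.items()):
        renormalized = locale.normalize(value)
        if renormalized != value:
            problems.append((key, value, renormalized))
    return problems


if __name__ == "__main__":
    # The result depends on the Python version, so nothing is assumed
    # here about how many entries (if any) are reported.
    for key, value, renormalized in non_idempotent_aliases():
        print("{0!r}: {1!r} -> {2!r}".format(key, value, renormalized))
```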
[issue5815] locale.getdefaultlocale() missing corner case
Dmitry Jemerov added the comment: A related issue (a case which isn't taken into account by Serhiy's patch) is http://bugs.python.org/issue18378
-- nosy: +Dmitry.Jemerov
[issue5815] locale.getdefaultlocale() missing corner case
Changes by Serhiy Storchaka storch...@gmail.com: -- versions: +Python 3.4
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka storch...@gmail.com added the comment:

Here are some more inconsistencies:

    $ LANG=uk_ua.microsoftcp1251 ./python -c "import locale; print(locale.getdefaultlocale())"
    ('uk_UA', 'CP1251')
    $ LANG=uk_ua.microsoft-cp1251 ./python -c "import locale; print(locale.getdefaultlocale())"
    ('uk_UA', 'microsoft_cp1251')
    $ ./python -c "import locale; print(locale.normalize('ka_ge.georgianacademy'))"
    ka_GE.GEORGIAN-ACADEMY
    $ ./python -c "import locale; print(locale.normalize('ka_GE.GEORGIAN-ACADEMY'))"
    ka_GE.georgian_academy
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka storch...@gmail.com added the comment: Here is a complex patch for more careful locale parsing.
-- Added file: http://bugs.python.org/file26380/locale_parse.patch
[issue5815] locale.getdefaultlocale() missing corner case
rg3 sarbalap+freshm...@gmail.com added the comment:

I don't know if the behavior is considered a bug or just undocumented, but under Python 2.7.3 it's still the same. locale.getpreferredencoding() does return UTF-8, but the second element in the tuple returned by locale.getdefaultlocale() is utf_8_valencia, which is not a valid encoding despite the documentation saying it's supposed to be an encoding name.

From my terminal:

    $ python -V
    Python 2.7.3
    $ LANG=ca_ES.UTF-8@valencia python -c 'import locale; print locale.getpreferredencoding()'
    UTF-8
    $ LANG=ca_ES.UTF-8@valencia python -c 'import locale; print locale.getdefaultlocale()'
    ('ca_ES', 'utf_8_valencia')
    $ LANG=ca_ES.UTF-8 python -c 'import locale; print locale.getpreferredencoding()'
    UTF-8
    $ LANG=ca_ES.UTF-8 python -c 'import locale; print locale.getdefaultlocale()'
    ('ca_ES', 'UTF-8')
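Editorial sketch, not part of the thread: a quick way to confirm that a reported encoding name is one Python can actually use is to look it up in the codec registry. The helper name is_known_encoding is an assumption; codecs.lookup() is the standard-library call and raises LookupError for unknown names.

```python
import codecs


def is_known_encoding(name):
    """Return True if Python has a codec registered under this name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("UTF-8", "utf_8_valencia"):
        print(candidate, is_known_encoding(candidate))
```

A caller could run the second element of the getdefaultlocale() tuple through such a check before passing it to encode().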
[issue5815] locale.getdefaultlocale() missing corner case
Serhiy Storchaka storch...@gmail.com added the comment: The patch does not work for the ca_ES@valencia locale. There are also issues with these locales: ks_in@devanagari, ks...@devanagari.utf-8, sd, sd...@devanagari.utf-8 (in locale_alias, ks_in@devanagari maps to ks...@devanagari.utf-8 and sd maps to sd...@devanagari.utf-8).
-- nosy: +storchaka
[issue5815] locale.getdefaultlocale() missing corner case
Greg Roodt gro...@gmail.com added the comment: Bumping this as part of a bug scrub at EuroPython. Is this still an issue? Should we fix it in the docs or in the code?
-- nosy: +groodt
[issue5815] locale.getdefaultlocale() missing corner case
Changes by Ezio Melotti ezio.melo...@gmail.com: -- keywords: +easy; versions: +Python 3.2, Python 3.3 -Python 2.6, Python 3.0, Python 3.1
[issue5815] locale.getdefaultlocale() missing corner case
New submission from rg3 sarbalap+freshm...@gmail.com:

A recent issue with one of my programs has shown that locale.getdefaultlocale() does not correctly handle a corner case. The issue URL is this one: http://bitbucket.org/rg3/youtube-dl/issue/7/

Essentially, some users have LANG set to something like es_ca.ut...@valencia. In that case, locale.getdefaultlocale() returns, as the encoding, the string utf_8_valencia, which cannot be used as an argument to the string encode() function. The obvious correct encoding in this case is UTF-8.

I have traced the problem and it seems that it could be fixed by the attached patch. It checks if the encoding, at that point, contains the '@' symbol and, in that case, removes everything starting at that point, leaving only UTF-8.

I am not sure if this patch or a similar one should be applied to other Python versions. My system has Python 2.5.2 and that's what I have patched.

Explanation as to why I put the code there:

* The simple case, es_CA.UTF-8, goes through that point too and enters the if.
* I wanted to remove what goes after the '@' symbol at that point, so it either needed to be removed before the call to the normalizing function or inside the normalization.
* As this is not what I would consider a normalization, I put the code before the function call.

Thanks for your hard work. I hope my patch is valid. Regards.

-- components: Library (Lib); files: locale.diff; keywords: patch; messages: 86312; nosy: rg3; severity: normal; status: open; title: locale.getdefaultlocale() missing corner case; type: behavior; versions: Python 2.5
Added file: http://bugs.python.org/file13737/locale.diff
[issue5815] locale.getdefaultlocale() missing corner case
rg3 sarbalap+freshm...@gmail.com added the comment: I just realized that the if I introduced is not really needed. encoding = encoding.split('@')[0] works whether the '@' symbol is present or not.
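Editorial sketch, not part of the thread: the one-line idea above, wrapped in a function so both cases can be seen side by side. The function name strip_modifier is an assumption; the body is the split('@')[0] expression from the message. This only illustrates the proposal and is not the fix that was eventually committed.

```python
def strip_modifier(encoding):
    """Drop an '@modifier' suffix, if any, from an encoding string."""
    # str.split always returns at least one element, so no check for
    # the presence of '@' is needed before indexing.
    return encoding.split("@")[0]


if __name__ == "__main__":
    print(strip_modifier("UTF-8@valencia"))
    print(strip_modifier("UTF-8"))
```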
[issue5815] locale.getdefaultlocale() missing corner case
R. David Murray rdmur...@bitdance.com added the comment: I wasn't able to reproduce this by just setting my LC_ALL environment variable to es_ca.ut...@valencia and calling getdefaultlocale. Can you provide more complete steps to reproduce?
-- nosy: +r.david.murray; priority: -> normal; stage: -> test needed
[issue5815] locale.getdefaultlocale() missing corner case
rg3 sarbalap+freshm...@gmail.com added the comment: You are right. The issue is not reproduced with es_ca.ut...@valencia but with ca_es.ut...@valencia. The fact that the first case works makes me think maybe there's another way to solve the problem. Can you check that?
[issue5815] locale.getdefaultlocale() missing corner case
rg3 sarbalap+freshm...@gmail.com added the comment: Further investigation: the user who had this issue may be from Valencia, Spain. According to the manpage for setlocale(3) on my system, the form is usually language[_territory][.codese...@modifier]. So, in this case, it would make sense for the language to be ca (Catalan) and the territory ES (Spain). My patch may be fine after all, because if the @modifier is still present at that point (I have seen code that removes it before that point), you'd still want to remove it and keep only the codeset, which is the interesting part.
[issue5815] locale.getdefaultlocale() missing corner case
R. David Murray rdmur...@bitdance.com added the comment:

OK, it turns out that this is one of a class of known bugs of long standing (see issue554676 and issue1080864, for example). The recommended solution is not to use locale.getdefaultlocale, but to use locale.getpreferredencoding. I have confirmed that that works for the case of ca_es.ut...@valencia in python2.5.

There is at least a doc bug here, since no mention of this fragility/recommendation is made in the getdefaultlocale documentation.

Using getpreferredencoding seems to be the correct solution to your problem. However, the locale.py module contains a number of examples of modifiers in the locale_alias table. Presumably this case could be added, but it is not clear to me what the policy is on that at this time, so I'm adding Martin to the nosy list looking for some guidance.

-- assignee: -> georg.brandl; components: +Documentation; nosy: +georg.brandl, loewis; stage: test needed -> needs patch; versions: +Python 2.6, Python 2.7, Python 3.0, Python 3.1 -Python 2.5
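Editorial sketch, not part of the thread: the recommendation above, asking locale.getpreferredencoding() instead of taking the second element of getdefaultlocale(), might look like this in a program that needs an output encoding. The function name output_encoding and the "utf-8" fallback are assumptions for illustration.

```python
import codecs
import locale


def output_encoding(default="utf-8"):
    """Return a codec name usable with str.encode().

    The value comes from locale.getpreferredencoding(); the default is
    used only if that name is unknown to Python's codec registry.
    The fallback value is an assumption of this sketch.
    """
    # Passing False avoids calling setlocale() as a side effect.
    name = locale.getpreferredencoding(False)
    try:
        return codecs.lookup(name).name
    except LookupError:
        return default


if __name__ == "__main__":
    print(output_encoding())
```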
[issue5815] locale.getdefaultlocale() missing corner case
rg3 sarbalap+freshm...@gmail.com added the comment: Excellent. Thanks for the tip. I'll now proceed to modify my code to use getpreferredencoding. Still, I think getdefaultlocale should work because it could be used in other situations, I suppose.