[issue1813] Codec lookup failing under turkish locale
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:

--
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1813
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue1813] Codec lookup failing under turkish locale
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset a55ffb6c1993 by Stefan Krah in branch '3.2':
Issue #1813: Revert workaround for a glibc bug on the Fedora buildbot.
http://hg.python.org/cpython/rev/a55ffb6c1993

New changeset 4244e4348362 by Stefan Krah in branch 'default':
Issue #1813: merge changeset that reverts a glibc workaround for the
http://hg.python.org/cpython/rev/4244e4348362

New changeset 0b8917fc6db5 by Stefan Krah in branch '2.7':
Issue #1813: backport changeset that reverts a glibc workaround for the
http://hg.python.org/cpython/rev/0b8917fc6db5
[issue1813] Codec lookup failing under turkish locale
Stefan Krah stefan-use...@bytereef.org added the comment:

I've upgraded the Fedora buildbot to Fedora-16. The specific glibc workaround should not be necessary any more. So the test will now fail again on all systems that a) have the bug and b) have the tr_TR locale installed.
[issue1813] Codec lookup failing under turkish locale
Stefan Krah stefan-use...@bytereef.org added the comment:

https://bugzilla.redhat.com/show_bug.cgi?id=726536 claims that the glibc issue (which is relevant for skipping the test case) is fixed in glibc-2.14.90-8.

I suspect the only way of running the test case reliably is whitelisting a couple of known good glibc versions.
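A rough sketch of what such a whitelist check might look like in Python. Everything here is an assumption made for illustration: the helper names are hypothetical, the version numbers are only the ones reported in this thread, and distribution backports (such as the "-8" package release) are not taken into account.

```python
import platform
import re

# Assumptions, taken from reports in this thread rather than from any
# authoritative list:
FIXED_SINCE = (2, 14, 90)   # Red Hat bug 726536 claims a fix in glibc-2.14.90-8
KNOWN_GOOD = {(2, 11, 1)}   # reported as working on Ubuntu Lucid and OpenSUSE


def parse_version(text):
    """Turn the leading dotted number of '2.13-r2' or '2.14.90' into an int tuple."""
    match = re.match(r'\d+(?:\.\d+)*', text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split('.'))


def glibc_is_trusted(lib, version):
    """Hypothetical check: True only for glibc versions believed to be unaffected."""
    if lib != 'glibc':
        return False  # unknown C library: make no claim either way
    parsed = parse_version(version)
    return parsed in KNOWN_GOOD or parsed >= FIXED_SINCE


# platform.libc_ver() returns ('', '') when it cannot identify glibc.
lib, version = platform.libc_ver()
print(lib, version, glibc_is_trusted(lib, version))
```

A test could call `glibc_is_trusted(*platform.libc_ver())` and skip (or expect failure) when it returns False, though whether a version whitelist is the right policy is exactly what this thread is debating.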
[issue1813] Codec lookup failing under turkish locale
Stefan Krah stefan-use...@bytereef.org added the comment:

Unrelated to the Fedora issue: The test is currently skipped on the FreeBSD bot, but completes successfully with:

diff -r 0b52b6f1bfab Lib/test/test_locale.py
--- a/Lib/test/test_locale.py   Tue Aug 02 10:16:45 2011 +0200
+++ b/Lib/test/test_locale.py   Tue Aug 02 11:37:39 2011 +0200
@@ -399,7 +399,7 @@
         oldlocale = locale.setlocale(locale.LC_CTYPE)
         self.addCleanup(locale.setlocale, locale.LC_CTYPE, oldlocale)
         try:
-            locale.setlocale(locale.LC_CTYPE, 'tr_TR')
+            locale.setlocale(locale.LC_CTYPE, 'tr_TR.UTF-8')
         except locale.Error:
             # Unsupported locale on this system
             self.skipTest('test needs Turkish locale')
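Since the accepted spelling of the Turkish locale evidently differs between platforms, one option is to probe several names before skipping. The following is only a sketch under that assumption; the helper name and the candidate list are hypothetical and not part of the patch above.

```python
import locale

# Candidate spellings (assumption: which of these exist is platform-dependent).
TURKISH_CANDIDATES = ('tr_TR', 'tr_TR.UTF-8', 'tr_TR.ISO8859-9')


def first_available_locale(candidates, category=locale.LC_CTYPE):
    """Return the first name setlocale() accepts, or None if none is accepted.

    The previous locale setting is restored before returning.
    """
    saved = locale.setlocale(category)  # query only, does not change anything
    try:
        for name in candidates:
            try:
                locale.setlocale(category, name)
            except locale.Error:
                continue  # unsupported on this system, try the next spelling
            return name
        return None
    finally:
        locale.setlocale(category, saved)


print(first_available_locale(TURKISH_CANDIDATES))
```

A test could then call `self.skipTest('test needs Turkish locale')` only when the helper returns None.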
[issue1813] Codec lookup failing under turkish locale
Stefan Krah stefan-use...@bytereef.org added the comment:

As I wrote on python-dev, this test also fails on Debian lenny, which has the same setlocale() bug as Fedora. So, indeed the test should be skipped on a multitude of platforms.
[issue1813] Codec lookup failing under turkish locale
R. David Murray rdmur...@bitdance.com added the comment:

On Tue, 02 Aug 2011 12:12:37 +0200, Stefan Krah ste...@bytereef.org wrote:
> I suspect many buildbots are green because they don't have tr_TR and tr_TR.iso8859-9 installed.

This is true for my Gentoo buildbots. Once we've figured out the best way to handle this, I'll fix that (install the other locales) for my two.

When I run the C test program I get "null" as the final output, regardless of whether I use 'tr_TR' or 'tr_TR.utf8'. This is with glibc-2.13-r2 (the r2 is Gentoo's mod number).

As someone pointed out on python-dev, if this isn't fixable then it should be an expected failure, not a skip.

One question is, is there any platform on which the Turkish locale is installed where this test actually works?
[issue1813] Codec lookup failing under turkish locale
Stefan Krah stefan-use...@bytereef.org added the comment:

[Re-opening to fix the skips]

Yes, the test works on:
  Ubuntu Lucid (libc-2.11.1), OpenSUSE (libc-2.11.1), FreeBSD-8.2

Failure:
  Fedora 14 (libc-2.13), Debian lenny (libc-2.7), Gentoo (libc-2.13-r2)

So perhaps this test should be marked as expected failure on Linux altogether (unless we test for the libc version).

--
status: closed -> open
[issue1813] Codec lookup failing under turkish locale
Antoine Pitrou pit...@free.fr added the comment:

> As someone pointed out on python-dev, if this isn't fixable then it should be an expected failure, not a skip.

The Python bug is fixed, the problem is apparently some libcs have the same bug as we did...

> One question is, is there any platform on which the turkish locale is installed where this test actually works?

Well, it works here (Mageia).
[issue1813] Codec lookup failing under turkish locale
Stefan Krah stefan-use...@bytereef.org added the comment:

Fedora bug report: https://bugzilla.redhat.com/show_bug.cgi?id=726536
[issue1813] Codec lookup failing under turkish locale
R. David Murray rdmur...@bitdance.com added the comment:

I'm seeing this test failure in Gentoo, as well.

--
nosy: +r.david.murray
[issue1813] Codec lookup failing under turkish locale
Stefan Krah stefan-use...@bytereef.org added the comment:

The Fedora bot fails because here ...

    locale.setlocale(locale.LC_CTYPE, loc)

loc = ('tr_TR', 'ISO8859-9'), and apparently setlocale can only handle "tr_TR", but not "tr_TR.ISO8859-9":

144         if (locale) {
145             /* set locale */
146             result = setlocale(category, locale);
147             if (!result) {
148                 /* operation failed, no setting was changed */
149                 PyErr_SetString(Error, "unsupported locale setting");
150                 return NULL;
(gdb) p result = setlocale(category, "tr_TR.ISO8859-9")
$8 = 0x0
(gdb) p result = setlocale(category, "tr_TR")
$9 = 0x96d770 "tr_TR"
(gdb) p locale
$10 = 0x70f6a5b0 "tr_TR.ISO8859-9"
(gdb)

--
nosy: +skrah
[issue1813] Codec lookup failing under turkish locale
Stefan Krah stefan-use...@bytereef.org added the comment:

Stefan Krah rep...@bugs.python.org wrote:
> (gdb) p result = setlocale(category, "tr_TR.ISO8859-9")
> $8 = 0x0
> (gdb) p result = setlocale(category, "tr_TR")
> $9 = 0x96d770 "tr_TR"
> (gdb) p locale
> $10 = 0x70f6a5b0 "tr_TR.ISO8859-9"
> (gdb)

Perhaps this is a bug in Fedora's setlocale that can't handle the Turkish 'I' in 'ISO' when CTYPE is Turkish.
[issue1813] Codec lookup failing under turkish locale
Antoine Pitrou pit...@free.fr added the comment:

Stefan Krah rep...@bugs.python.org wrote:
> (gdb) p result = setlocale(category, "tr_TR.ISO8859-9")
> $8 = 0x0
> (gdb) p result = setlocale(category, "tr_TR")
> $9 = 0x96d770 "tr_TR"
> (gdb) p locale
> $10 = 0x70f6a5b0 "tr_TR.ISO8859-9"
> (gdb)
>
> Perhaps this is a bug in Fedora's setlocale that can't handle the turkish 'I' in 'ISO' when CTYPE is turkish.

Perhaps indeed. Maybe you should try to report it. It does look like an OS bug in any case.
(fortunately that buildbot is in the unstable bunch :-))
[issue1813] Codec lookup failing under turkish locale
Stefan Krah stefan-use...@bytereef.org added the comment:

Yes, it's a bug. This works:

    #include <stdio.h>
    #include <locale.h>

    int main(void)
    {
        char *s;
        printf("%s\n", setlocale(LC_CTYPE, "tr_TR.ISO8859-9"));
        printf("%s\n", setlocale(LC_CTYPE, NULL));
        s = setlocale(LC_CTYPE, "tr_TR.ISO8859-9");
        printf("%s\n", s ? s : "null");
        return 0;
    }

But when I change the first setlocale call to "tr_TR", the result of the last call is NULL.
[issue1813] Codec lookup failing under turkish locale
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 92d02de91cc9 by Antoine Pitrou in branch '3.2':
Issue #1813: Fix codec lookup under Turkish locales.
http://hg.python.org/cpython/rev/92d02de91cc9

New changeset a77a4df54b95 by Antoine Pitrou in branch '3.2':
Add a test for issue #1813: getlocale() failing under a Turkish locale
http://hg.python.org/cpython/rev/a77a4df54b95

New changeset fe0caf8c48d2 by Antoine Pitrou in branch 'default':
Add a test for issue #1813: getlocale() failing under a Turkish locale
http://hg.python.org/cpython/rev/fe0caf8c48d2

--
nosy: +python-dev
[issue1813] Codec lookup failing under turkish locale
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 739958134fe5 by Antoine Pitrou in branch '2.7':
Issue #1813: Fix codec lookup and setting/getting locales under Turkish locales.
http://hg.python.org/cpython/rev/739958134fe5
[issue1813] Codec lookup failing under turkish locale
Antoine Pitrou pit...@free.fr added the comment:

Finally fixed in 2.7, 3.2, 3.3!

--
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed
versions: +Python 3.3 -Python 2.6, Python 3.1
[issue1813] Codec lookup failing under turkish locale
STINNER Victor victor.stin...@haypocalc.com added the comment:

The decimal module has been fixed in Python 2.7, 3.2 and 3.3 for the Turkish locale: issue #11830.
[issue1813] Codec lookup failing under turkish locale
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:

--
nosy: +Arfrever
[issue1813] Codec lookup failing under turkish locale
Changes by Gökçen Eraslan gok...@pardus.org.tr:

--
nosy: +Gökçen.Eraslan
[issue1813] Codec lookup failing under turkish locale
Dirkjan Ochtman dirk...@ochtman.nl added the comment:

We've included this patch in Gentoo for about two years now. Can we get some discussion going on doing something like this?

--
nosy: +djc
[issue1813] Codec lookup failing under turkish locale
Marc-Andre Lemburg m...@egenix.com added the comment:

Looking at this again, I think we should change the codec registry C code to use Py_TOLOWER() and the encoding search function code to use the .translate() approach that Antoine suggested.
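For readers unfamiliar with the idea, a translate-based, ASCII-only lowering might look roughly like this in Python 3. This is a sketch, not the patch discussed in this thread (which targeted Python 2); the table and function names are hypothetical.

```python
import string

# Translation table that maps only A-Z to a-z; every other character is
# left untouched and the current locale is never consulted.
_ASCII_LOWER_TABLE = str.maketrans(string.ascii_uppercase,
                                   string.ascii_lowercase)


def normalize_codec_name(name):
    """Hypothetical helper: lowercase the ASCII letters of a codec name."""
    return name.translate(_ASCII_LOWER_TABLE)


print(normalize_codec_name('ISO8859-9'))
```

Because the table is fixed at import time, the result cannot change when the process later switches to a Turkish locale, which is the property the codec lookup needs.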
[issue1813] Codec lookup failing under turkish locale
STINNER Victor victor.stin...@haypocalc.com added the comment:

There is also a locale normalization function in unicodeobject.c: normalize_encoding(). This function uses

    if (ISUPPER(*e))
        *l++ = TOLOWER(*e++);

which uses the Python, *locale-independent*, implementation of ctype. We should maybe use the ISUPPER / TOLOWER in codecs.c.

Anyway, a function should be fixed, but I don't know which one :-)
[issue1813] Codec lookup failing under turkish locale
Mark Lawrence breamore...@yahoo.co.uk added the comment:

Does anyone know if this was discussed on python-dev? I've tried searching the archives and didn't find anything, but that's not to say it isn't there.

--
nosy: +BreamoreBoy
[issue1813] Codec lookup failing under turkish locale
Changes by Antoine Pitrou pit...@free.fr:

--
nosy: +haypo
versions: +Python 2.7, Python 3.1, Python 3.2
[issue1813] Codec lookup failing under turkish locale
Changes by Jakub Wilk jw...@jwilk.net:

--
nosy: +jwilk
[issue1813] Codec lookup failing under turkish locale
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment:

Sean: I'd suggest to discuss this on python-dev.

Note that even if we do use Unicode for the cases in question, the Turkish locale will still pose a problem - see #1528802 for a discussion.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1813
__
[issue1813] Codec lookup failing under turkish locale
Sean Reifschneider [EMAIL PROTECTED] added the comment:

Marc-Andre: How should we proceed with this bug? Discuss on python-dev or c.l.python?

--
assignee: -> lemburg
keywords: +patch
nosy: +jafo
priority: -> normal
[issue1813] Codec lookup failing under turkish locale
Antoine Pitrou added the comment:

The C library's tolower() and toupper() are used in a handful of source files. It might make sense to replace some of those calls with ascii-only versions of the corresponding functions.

Modules/_sre.c:    return ((ch) < 256 ? (unsigned int)tolower((ch)) : ch);
Modules/_sqlite/cursor.c:    *dst++ = tolower(*src++);
Modules/stropmodule.c:    *s_new = tolower(c);
Modules/stropmodule.c:    *s_new = toupper(c);
Modules/stropmodule.c:    *s_new = toupper(c);
Modules/stropmodule.c:    *s_new = tolower(c);
Modules/stropmodule.c:    *s_new = toupper(c);
Modules/stropmodule.c:    *s_new = tolower(c);
Modules/unicodedata.c:    h = (h * scale) + (unsigned char) toupper(Py_CHARMASK(s[i]));
Modules/unicodedata.c:    if (toupper(Py_CHARMASK(name[i])) != buffer[i])
Modules/_tkinter.c:    argv0[0] = tolower(Py_CHARMASK(argv0[0]));
Modules/binascii.c:    c = tolower(c);
Objects/stringobject.c:    s[i] = _tolower(c);
Objects/stringobject.c:    s[i] = _toupper(c);
Objects/stringobject.c:    c = toupper(c);
Objects/stringobject.c:    c = tolower(c);
Objects/stringobject.c:    *s_new = toupper(c);
Objects/stringobject.c:    *s_new = tolower(c);
Objects/stringobject.c:    *s_new = toupper(c);
Objects/stringobject.c:    *s_new = tolower(c);
Parser/tokenizer.c:    else buf[i] = tolower(c);
Python/codecs.c:    ch = tolower(Py_CHARMASK(ch));
Python/dynload_win.c:    first = tolower(*string1);
Python/dynload_win.c:    second = tolower(*string2);
Python/pystrcmp.c:    while ((--size > 0) && (tolower(*s1) == tolower(*s2))) {
Python/pystrcmp.c:    return tolower(*s1) - tolower(*s2);
Python/pystrcmp.c:    while (*s1 && (tolower(*s1++) == tolower(*s2++))) {
Python/pystrcmp.c:    return (tolower(*s1) - tolower(*s2));
[issue1813] Codec lookup failing under turkish locale
Antoine Pitrou added the comment: As for the .upper() and .lower() methods, they are used in quite a bunch of standard library modules :-/... Lib/base64.py Lib/BaseHTTPServer.py Lib/bsddb/test/test_compare.py Lib/bsddb/test/test_dbobj.py Lib/CGIHTTPServer.py Lib/cgi.py Lib/compiler/ast.py Lib/ConfigParser.py Lib/cookielib.py Lib/Cookie.py Lib/csv.py Lib/ctypes/test/test_byteswap.py Lib/ctypes/util.py Lib/decimal.py Lib/distutils/command/bdist_rpm.py Lib/distutils/command/bdist_wininst.py Lib/distutils/command/register.py Lib/distutils/msvc9compiler.py Lib/distutils/msvccompiler.py Lib/distutils/sysconfig.py Lib/distutils/tests/test_dist.py Lib/distutils/util.py Lib/email/charset.py Lib/email/encoders.py Lib/email/header.py Lib/email/__init__.py Lib/email/message.py Lib/email/_parseaddr.py Lib/email/test/test_email.py Lib/email/test/test_email_renamed.py Lib/encodings/idna.py Lib/encodings/punycode.py Lib/formatter.py Lib/ftplib.py Lib/gettext.py Lib/htmllib.py Lib/HTMLParser.py Lib/httplib.py Lib/idlelib/configDialog.py Lib/idlelib/EditorWindow.py Lib/idlelib/IOBinding.py Lib/idlelib/keybindingDialog.py Lib/idlelib/PyShell.py Lib/idlelib/SearchDialogBase.py Lib/idlelib/tabbedpages.py Lib/idlelib/TreeWidget.py Lib/imaplib.py Lib/inspect.py Lib/lib-tk/turtle.py Lib/locale.py Lib/logging/handlers.py Lib/logging/__init__.py Lib/_LWPCookieJar.py Lib/macpath.py Lib/mailcap.py Lib/markupbase.py Lib/mhlib.py Lib/mimetools.py Lib/mimetypes.py Lib/mimify.py Lib/msilib/__init__.py Lib/nntplib.py Lib/ntpath.py Lib/nturl2path.py Lib/optparse.py Lib/os2emxpath.py Lib/os.py Lib/pdb.py Lib/plat-irix5/flp.py Lib/plat-irix6/flp.py Lib/plat-mac/buildtools.py Lib/plat-mac/gensuitemodule.py Lib/plat-riscos/riscospath.py Lib/pyclbr.py Lib/rfc822.py Lib/robotparser.py Lib/sgmllib.py Lib/SimpleHTTPServer.py Lib/smtpd.py Lib/smtplib.py Lib/socket.py Lib/sqlite3/test/hooks.py Lib/sre_constants.py Lib/stringold.py Lib/stringprep.py Lib/string.py Lib/_strptime.py Lib/subprocess.py 
Lib/test/regrtest.py Lib/test/test_bigmem.py Lib/test/test_codeccallbacks.py Lib/test/test_codecs.py Lib/test/test_cookielib.py Lib/test/test_datetime.py Lib/test/test_decimal.py Lib/test/test_deque.py Lib/test/test_descr.py Lib/test/test_fileinput.py Lib/test/test_grp.py Lib/test/test_hmac.py Lib/test/test_httplib.py Lib/test/test_os.py Lib/test/test_smtplib.py Lib/test/test_sort.py Lib/test/test_ssl.py Lib/test/test_strop.py Lib/test/test_strptime.py Lib/test/test_support.py Lib/test/test_ucn.py Lib/test/test_unicodedata.py Lib/test/test_urllib2.py Lib/test/test_urllib.py Lib/test/test_wsgiref.py Lib/test/test_xmlrpc.py Lib/urllib2.py Lib/urllib.py Lib/urlparse.py Lib/UserString.py Lib/uuid.py Lib/warnings.py Lib/webbrowser.py Lib/wsgiref/handlers.py Lib/wsgiref/headers.py Lib/wsgiref/simple_server.py Lib/wsgiref/util.py Lib/wsgiref/validate.py Lib/xml/dom/minidom.py Lib/xml/dom/xmlbuilder.py Lib/xmllib.py
[issue1813] Codec lookup failing under turkish locale
Antoine Pitrou added the comment:

Even if we don't fix all uses of (?to)(lower|upper) in the source tree, I think it's important that codec and locale lookup work properly when the current locale defines non-latin case folding for latin characters. Here is a patch.

Perhaps also the str type should grow ascii_lower() and ascii_upper() methods, since many cases of using lower() and upper() actually assume ascii semantics (e.g. for parsing of HTTP or SMTP headers).

Added file: http://bugs.python.org/file9440/turklocale.patch
[issue1813] Codec lookup failing under turkish locale
Changes by Antoine Pitrou:

--
versions: +Python 2.6 -Python 2.5
[issue1813] Codec lookup failing under turkish locale
Marc-Andre Lemburg added the comment:

I agree that it's a bit unfortunate that the 8-bit string APIs in Python use the locale-aware C functions by default (this should really be reversed: there should be locale-aware .upper() and .lower() methods and the standard ones should work just like the Unicode ones - without dependency on the locale, using ASCII mappings), but for historical reasons this cannot easily be changed.

.lower() and .upper() for 8-bit strings were always locale dependent and, before the addition of Unicode, setting the locale was the most common way to make an application understand different character sets.

In Python 3k the problem will probably go away, since .lower() and .upper() will then no longer depend on the locale.

Perhaps we should just convert a few of the cases you found to using Unicode strings instead of 8-bit strings in 2.6 ?! That would both make the code more portable and also provide a clear statement of "this is a text string", making porting to Py3k easier.

--
nosy: +lemburg
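The expectation about Python 3 can be illustrated with a short snippet. It reflects how current Python 3 releases behave (Unicode-defined case mapping for str, ASCII-only mapping for bytes) and is offered as an illustration rather than as part of this issue's fix.

```python
# Python 3: case conversion no longer consults the C locale.
codec_name = 'ISO8859-9'

# str.lower() follows Unicode rules, so ASCII 'I' always maps to 'i'.
lowered_text = codec_name.lower()

# bytes.lower() only touches the ASCII letters A-Z.
lowered_bytes = codec_name.encode('ascii').lower()

# U+0130 (capital I with dot above) lowercases to 'i' followed by U+0307.
dotted = '\u0130'.lower()

print(lowered_text, lowered_bytes, ascii(dotted))
```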
[issue1813] Codec lookup failing under turkish locale
Árni Már Jónsson added the comment:

There is more to this bug than appears. I'm guessing that the name mangling code in locale (e.g. the normalizing code) is locale dependent. See this example:

    #!/usr/bin/python2.5
    import locale

    print 'TR', locale.normalize('tr')
    print locale.setlocale(locale.LC_ALL, ('tr_TR', 'ISO8859-9'))
    # first issue, not quite the same coming out, as came in
    print locale.getlocale()
    # and this fails
    print locale.setlocale(locale.LC_ALL, ('tr_TR', 'ISO8859-9'))

First, the value returned from getlocale is ('tr_TR', 'so8859-9'), not ('tr_TR', 'ISO8859-9'), and the second setlocale fails.
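The same round trip can be written defensively for Python 3, so that it restores the previous locale and reports "unsupported" instead of raising. This is a sketch only: the helper name is hypothetical, and it uses LC_CTYPE rather than LC_ALL as in the script above because that is the category getlocale() reads by default.

```python
import locale


def turkish_roundtrip(locale_name='tr_TR.ISO8859-9'):
    """Return None if the locale is unavailable, else (getlocale result, ok).

    `ok` is True when the value returned by getlocale() is accepted by a
    second setlocale() call. Hypothetical helper, for illustration only.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        try:
            locale.setlocale(locale.LC_CTYPE, locale_name)
        except locale.Error:
            return None  # Turkish locale not installed on this system
        try:
            current = locale.getlocale(locale.LC_CTYPE)
        except ValueError:
            return None, False  # getlocale() could not parse the locale name
        try:
            locale.setlocale(locale.LC_CTYPE, current)
            ok = True
        except locale.Error:
            ok = False  # the round trip failed
        return current, ok
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


print(turkish_roundtrip())
```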
[issue1813] Codec lookup failing under turkish locale
Antoine Pitrou added the comment:

I can confirm this on SVN trunk on a Mandriva system.

--
nosy: +pitrou
[issue1813] Codec lookup failing under turkish locale
Changes by Árni Már Jónsson:

--
components: +Library (Lib) -Interpreter Core
[issue1813] Codec lookup failing under turkish locale
New submission from Árni Már Jónsson:

When switching to a Turkish locale, the codecs registry fails on a codec lookup which worked before the locale change. This happens when the codec name contains an uppercase 'I'.

What happens is that, just before doing a cache lookup, the string is normalized, which includes a call to ctype.h's tolower(). tolower() is locale dependent, and the Turkish locale handles 'I' differently from other locales. Thus the lookup fails, since the normalization behaves differently than it did before.

Replacing the tolower() call with this made the lookup work:

    int my_tolower(char c)
    {
        if ('A' <= c && c <= 'Z')
            c += 32;
        return c;
    }

PS: If the Turkish locale is not supported, the following will enable it on an Ubuntu system:

a) sudo cp /usr/share/i18n/SUPPORTED /var/lib/locales/supported.d/local
   (or just copy the lines with tr in them)
b) sudo dpkg-reconfigure locales

--
components: Interpreter Core
files: verify_locale.py
messages: 59821
nosy: arnimar
severity: normal
status: open
title: Codec lookup failing under turkish locale
type: behavior
versions: Python 2.5
Added file: http://bugs.python.org/file9140/verify_locale.py
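The ASCII-only lowering from the report can be mirrored in Python to make the intended behaviour concrete. This is an illustrative port with a hypothetical name, not code from the attached verify_locale.py.

```python
def ascii_lower(name):
    """Lowercase only the ASCII letters A-Z, leaving all other characters alone.

    Hypothetical port of my_tolower() from the report; it never consults
    the current locale.
    """
    out = []
    for ch in name:
        if 'A' <= ch <= 'Z':
            # Same arithmetic as the C version: ord('a') - ord('A') == 32.
            ch = chr(ord(ch) + 32)
        out.append(ch)
    return ''.join(out)


print(ascii_lower('ISO8859-9'))
```

With this mapping, 'I' always becomes 'i', so a codec name normalized before a locale change matches the one normalized afterwards.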