Ezio Melotti <ezio.melo...@gmail.com> added the comment:

I did some tests as well and here is what I got:

Python2.4 WinXP:
>>> import locale
>>> import string
>>> locale.setlocale(locale.LC_ALL, '')
'Italian_Italy.1252'
>>> string.lowercase
'abcdefghijklmnopqrstuvwxyz\x83\x9a\x9c\x9e\xaa\xb5\xba\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
>>> print string.lowercase
abcdefghijklmnopqrstuvwxyzâܣ׬Á║▀ÓßÔÒõÕµþÞÚÛÙýݯ´±‗¾¶§÷°¨·¹³²■
>>> import unicodedata
>>> set(map(unicodedata.category, string.lowercase.decode('windows-1252')))
set(['Ll'])
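
A possible explanation for the printed line in the Python 2.4 session above, offered as an assumption rather than something established in this report: the console may be rendering the bytes with an OEM code page such as cp850 instead of Windows-1252. The sketch below is written for Python 3 (not the 2.x interpreters used above) and decodes the first few non-ASCII bytes of string.lowercase both ways; the variable names are illustrative only.

```python
# Minimal Python 3 sketch: the same bytes, decoded with two code pages.
# Assumption: the console used cp850; the report does not state this.

# First eight non-ASCII bytes of string.lowercase from the session above.
raw = bytes([0x83, 0x9A, 0x9C, 0x9E, 0xAA, 0xB5, 0xBA, 0xDF])

as_ansi = raw.decode("cp1252")  # the code page named by the locale
as_oem = raw.decode("cp850")    # a common Western European console code page

print(as_ansi)  # ƒšœžªµºß
print(as_oem)   # âܣ׬Á║▀
```

The cp850 result matches the start of the printed output above, which is consistent with the bytes being correct and only the display differing.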

Python2.6 WinXP:
>>> import locale
>>> import string
>>> locale.setlocale(locale.LC_ALL, '')
'Italian_Italy.1252'
>>> string.lowercase
'abcdefghijklmnopqrstuvwxyz\x83\x9a\x9c\x9e\xaa\xb5\xba\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
>>> print string.lowercase
abcdefghijklmnopqrstuvwxyzƒsozªµºßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ
>>> import unicodedata
>>> set(map(unicodedata.category, string.lowercase.decode('windows-1252')))
set(['Ll'])

As you can see, the two strings are the same in both versions, and all the
characters correctly belong to the Ll (letter, lowercase) Unicode category.
For some reason they look different only when they are printed.

If these characters are not added to string.lowercase on Linux when you
change the locale, then it's a bug.
Can you reproduce it with recent versions of Python?

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6525>
_______________________________________