New submission from rg3 <sarbalap+freshm...@gmail.com>:

A recent issue with one of my programs has shown that locale.getdefaultlocale() does not correctly handle a corner case. The issue URL is this one:
http://bitbucket.org/rg3/youtube-dl/issue/7/

Essentially, some users have LANG set to something like es_ca.ut...@valencia. In that case, locale.getdefaultlocale() returns, as the encoding, the string "utf_8_valencia", which cannot be used as an argument to the string encode() function. The obvious correct encoding in this case is UTF-8.

I have traced the problem and it seems that it could be fixed by the attached patch. It checks whether the encoding, at that point, contains the '@' symbol and, in that case, removes everything from that symbol onwards, leaving only "UTF-8".

I am not sure if this patch or a similar one should be applied to other Python versions. My system has Python 2.5.2 and that's what I have patched.

Explanation as to why I put the code there:

* The simple case, es_CA.UTF-8, goes through that point too and enters the "if".
* I wanted to remove what goes after the '@' symbol at that point, so it needed to be removed either before the call to the normalizing function or inside the normalization.
* As this is not what I would consider a normalization, I put the code before the function call.

Thanks for your hard work. I hope my patch is valid. Regards.

----------
components: Library (Lib)
files: locale.diff
keywords: patch
messages: 86312
nosy: rg3
severity: normal
status: open
title: locale.getdefaultlocale() missing corner case
type: behavior
versions: Python 2.5
Added file: http://bugs.python.org/file13737/locale.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue5815>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
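
For illustration only, the idea described in the report (drop everything from the '@' symbol onwards before the encoding is used) might be sketched roughly as below. This is not the attached locale.diff and not the code in the locale module: the helper names strip_modifier and parse_lang are hypothetical, the sample value ca_ES.UTF-8@valencia is an assumed stand-in for the kind of LANG setting mentioned in the report, and the sketch skips the alias tables that the real module consults.

```python
import codecs


def strip_modifier(encoding):
    """Return the encoding with any '@modifier' suffix removed.

    Hypothetical helper, named here for illustration only.
    """
    # Keep only the part before the first '@': 'UTF-8@valencia' -> 'UTF-8'.
    return encoding.split('@', 1)[0]


def parse_lang(value):
    """Split a LANG-style value into (language code, encoding).

    Simplified sketch; it does not reproduce locale's own parsing.
    """
    if '.' not in value:
        # No encoding part at all, e.g. 'C' or 'es_ES'.
        return value.split('@', 1)[0], None
    code, encoding = value.split('.', 1)
    return code, strip_modifier(encoding)


# Assumed sample value, not taken from the report.
code, encoding = parse_lang('ca_ES.UTF-8@valencia')

# Raises LookupError if the cleaned name is not a known codec.
codecs.lookup(encoding)
```

Stripping the modifier before any normalization step mirrors the placement argued for in the report, since the cleaned string is then an ordinary encoding name by the time it is normalized.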