On Fri, Feb 04, 2011 at 11:42:51AM +0100, Jakub Wilk wrote:
> Disclaimer: I'm not maintainer of this package.
>
> * Yann Dirson <[email protected]>, 2011-02-03, 20:56:
> >In [1]: import xdg.DesktopEntry
> >
> >In [3]: e=xdg.DesktopEntry.DesktopEntry()
> >
> >In [4]: e.parse('plugins/Games/Chess.desktop')
> >
> >In [5]: e.getName()
> >Out[5]: u'\xc9checs'
> >
> >In [6]: print "%s" % e.getName()
> >------> print("%s" % e.getName())
> >Échecs
> >
> >
> >Now, if I use LC_ALL=fr_FR or fr_FR.ISO-8859-1 (which should be
> >equivalent), the final step instead throws:
> >
> >UnicodeEncodeError: 'ascii' codec can't encode character u'\xc9' in position
> >0: ordinal not in range(128)
>
> I assume that, as the subject suggest, it fails only if there is no
> fr_FR.ISO-8859-1 locale available. Am I correct? (If this is the
> case, perhaps something like fr_FR.ISO-8859-42 would be a better
> test-case, as it's less like to exist.)

Right.

> >By contrast, other programs in the same condition fallback to C
> >locale, which results in no error. I guess the xdg module should do
> >something similar.
>
> It is true that xdg uses a bit different language lookup algorithm
> that GNU gettext does. I can see it is flawed in a few ways and I
> can see your point. However, I don't think your particular use case
> is of much significance, for the following reasons:
>
> 1. If you are using non-existent locales, you shoot yourself in the
> foot. :)

Yes, but users may inadvertently request a locale with an implicit
encoding without realizing it is not the encoding they want (which is
exactly what happened to me, and I had not realized at first what the
problem was).

> 2. getName() returned a Unicode string for in French, which was kind
> of what you asked for.

Well, what I was asking for primarily was a string that would match
the locale, so it would fit into the GUI, and in that respect I got
something different from what I was expecting.

> 3. If you want your application to be robust, you should not print
> Unicode strings blindly. Encoding of sys.stdout can be ASCII even if
> proper UTF-8 locale is set:
>
> $ locale charmap
> UTF-8
>
> $ python -c 'print u"\xde"'
> Þ
>
> $ python -c 'print u"\xde"' | cat
> Traceback (most recent call last):
>  File "<string>", line 1, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xde' in position
> 0: ordinal not in range(128)

Right, the problem of spitting out debugging output using "print" is a
different one, and I had not realized that Python was disregarding the
locale when the output is redirected. I'll have to dig into this; do
you have any pointers?

Best regards,
-- 
Yann

_______________________________________________
Python-modules-team mailing list
[email protected]
http://lists.alioth.debian.org/mailman/listinfo/python-modules-team
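
For reference on point 3 (not printing Unicode strings blindly), here
is a minimal sketch of one way to encode a string for an output stream
without raising UnicodeEncodeError. It is not part of xdg or of the
code discussed in this thread; the helper name encode_for_stream is
hypothetical. It assumes the stream may report no encoding at all (as
Python 2 does for a piped stdout), in which case it tries the locale's
preferred encoding and finally ASCII, and it replaces characters the
target encoding cannot represent.

```python
import locale
import sys


def encode_for_stream(text, stream=None):
    """Encode a Unicode string for `stream` without raising.

    Hypothetical helper, for illustration only.
    """
    if stream is None:
        stream = sys.stdout
    # The stream's encoding can be None (e.g. Python 2 stdout on a pipe),
    # so fall back to the locale's preferred encoding, then to ASCII.
    encoding = (
        getattr(stream, "encoding", None)
        or locale.getpreferredencoding()
        or "ascii"
    )
    # "replace" substitutes "?" for characters the encoding cannot hold
    # instead of raising UnicodeEncodeError.
    return text.encode(encoding, "replace")
```

With an ASCII stream, u'\xc9checs' would come out as '?checs' rather
than a traceback, which is lossy but may be acceptable for debugging
output.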

