On Wed, 23 Feb 2005 07:21:40 -0500 Kent Johnson <[EMAIL PROTECTED]> wrote:
> This is a part of Python that still confuses me. I think what is
> happening is
> - self.nextfile is a Unicode string sometimes (when it includes special
>   characters)
> - the gettext string is a byte string
> - to compare the two, the byte string is promoted to Unicode by decoding
>   it with the system default encoding, which is generally 'ascii'.
> - the gettext string includes non-ascii characters and the codec raises
>   an exception.
>

Thanks Kent,

now it looks like the total confusion seems to clear up (at least
partially). After some googling it seems to me that the best bet is to use
unicode strings exclusively. When I set the unicode flag in
gettext.install() to 1 the gettext strings are unicode, however there's
still a problem with the user input. As you guessed, "self.nextfile" is
unicode only *sometimes*; I tried and changed the line from the old
traceback into:

    if unicode(self.nextfile, 'iso8859-1') == _('No destination file selected'):

Now when self.nextfile is an existing file "\xe4.wav" that was clicked on
in the file dialog's file list this works, however when I type "\xe4.wav"
into the file dialog's entry field I get:

TypeError Exception in Tk callback
  Function: <bound method Snackrecorder.start of <snackrecorder.Snackrecorder instance at 0xb774518c>> (type: <type 'instancemethod'>)
  Args: ()
Traceback (innermost last):
  File "/usr/lib/python2.3/site-packages/Pmw/Pmw_1_2/lib/PmwBase.py", line 1747, in __call__
    return apply(self.func, args)
  File "/usr/local/share/phonoripper-0.6.2/snackrecorder.py", line 304, in start
    if unicode(self.nextfile, 'iso8859-1') == _('No destination file selected'):
TypeError: decoding Unicode is not supported

At least this might explain why "A\xe4" worked and "\xe4" not as I
mentioned in a previous post.

Now the problem arises how to determine if self.nextfile is unicode or a
byte string? Or maybe even better, make sure that self.nextfile is always a
byte string so I can safely convert it to unicode later on.
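
A minimal sketch of one way to cope with a value that is only sometimes
Unicode: check the type first and decode only byte strings, so that a
value that is already Unicode never reaches the decoder. The helper name
to_unicode and the 'iso8859-1' default are assumptions for illustration
and are not part of snackrecorder.py; which encoding the user's input
really uses remains an open question. The b'' literal needs Python 2.6 or
later; on older versions a plain '' literal is already a byte string.

```python
# Pick the Unicode text type: ``unicode`` on Python 2, ``str`` on Python 3.
try:
    text_type = unicode
except NameError:
    text_type = str


def to_unicode(value, encoding='iso8859-1'):
    """Return value as a Unicode string (hypothetical helper).

    Byte strings are decoded with ``encoding``; values that are already
    Unicode are returned unchanged, because decoding them a second time
    is what triggers "decoding Unicode is not supported" on Python 2.
    """
    if isinstance(value, text_type):
        return value
    return value.decode(encoding)


# Both forms of the same file name compare equal after normalisation.
from_list = to_unicode(b'\xe4.wav')    # byte string, e.g. from the file list
from_entry = to_unicode(u'\xe4.wav')   # Unicode, e.g. typed into the entry
print(from_list == from_entry)
```

With a helper like this the comparison could be written as
to_unicode(self.nextfile) == _('No destination file selected'), provided
gettext is installed with the unicode flag set so that both sides are
Unicode.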
But how to convert unicode user input into byte strings when I don't even
know the user's encoding? I guess this will require some further research.

> I don't know what the best solution is. Two possibilities (substitute
> your favorite encoding for latin-1):
> - decode the gettext string, e.g.
>     if self.nextfile == _('No destination file selected').decode('latin-1'):
>
> - set your default encoding to latin-1. (This solution is frowned on by
>   the Python-Unicode cognoscenti and it makes your programs
>   non-portable). Do this by creating a file
>   site-packages/sitecustomize.py containing the lines
>     import sys
>     sys.setdefaultencoding('latin-1')
>
> Kent
>

Unfortunately the latter is no option, because I definitely need
portability. I guess I should probably use utf-8.

Thanks and best regards

Michael

> > ######################################################################
> > Error: 1
> > UnicodeDecodeError Exception in Tk callback
> >   Function: <bound method Snackrecorder.start of <snackrecorder.Snackrecorder instance at 0xb77fe24c>> (type: <type 'instancemethod'>)
> >   Args: ()
> > Traceback (innermost last):
> >   File "/usr/lib/python2.3/site-packages/Pmw/Pmw_1_2/lib/PmwBase.py", line 1747, in __call__
> >     return apply(self.func, args)
> >   File "/usr/local/share/phonoripper/snackrecorder.py", line 305, in start
> >     if self.nextfile == _('No destination file selected'):
> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 22: ordinal not in range(128)
> >
> > ######################################################################
>
> _______________________________________________
> Tutor maillist - Tutor@python.org
> http://mail.python.org/mailman/listinfo/tutor