Now it looks like the total confusion is starting to clear up (at least partially). After some googling, it
seems to me that the best bet is to use unicode strings exclusively.
I think that is a good plan.
When I set the unicode flag
in gettext.install() to 1, the gettext strings are unicode; however, there's still a problem with the user input. As you guessed, "self.nextfile" is unicode only *sometimes*; I tried changing the line from the old traceback to:
if unicode(self.nextfile, 'iso8859-1') == _('No destination file selected'):
How about this?

    n = self.nextfile
    if not isinstance(n, unicode):
        n = unicode(n, 'iso8859-1')
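A minimal sketch of that check as a reusable helper. The name to_text and the default 'iso8859-1' encoding are assumptions for illustration, not part of the original program. Testing against bytes rather than unicode keeps the sketch runnable on both Python 2.6+ (where bytes is an alias for str) and Python 3.

```python
def to_text(value, encoding='iso8859-1'):
    """Return value as a unicode (text) string.

    Byte strings are decoded with the given encoding (an assumed default);
    values that are already unicode are returned unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value
```

With a helper like this, the comparison could be written as `if to_text(self.nextfile) == _('No destination file selected'):` and would behave the same whether self.nextfile arrives as a byte string or as unicode.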
At least this might explain why "A\xe4" worked and "\xe4" did not, as I mentioned in a previous post. Now the question is how to determine whether self.nextfile is unicode or a byte string. Or maybe even better, make sure that self.nextfile is always a byte string so I can safely convert it to unicode later on. But how do I convert unicode user input into byte strings when I don't even know the user's encoding? I guess this will require some further research.
Why do you need to convert back to byte strings?
You can find out the console encoding from sys.stdin and sys.stdout:

    >>> import sys
    >>> sys.stdout.encoding
    'cp437'
    >>> sys.stdin.encoding
    'cp437'
IIRC there is also an encoding associated with the current locale, but I'm not sure how to use that.
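For what it's worth, the standard library exposes the locale's encoding through locale.getpreferredencoding(). The sketch below combines it with sys.stdin.encoding to pick an encoding for decoding user input. The function name guess_input_encoding, the order of the candidates, and the 'utf-8' fallback are all assumptions, not an established recipe.

```python
import codecs
import locale
import sys


def guess_input_encoding(default='utf-8'):
    """Guess an encoding for user input (hypothetical helper).

    Tries the console encoding first, then the locale's preferred
    encoding, and falls back to the given default.
    """
    candidates = [
        # May be None or missing, e.g. when stdin is redirected.
        getattr(sys.stdin, 'encoding', None),
        locale.getpreferredencoding(),
        default,
    ]
    for name in candidates:
        if not name:
            continue
        try:
            codecs.lookup(name)  # skip names Python has no codec for
        except LookupError:
            continue
        return name
    return default
```

The result could then be passed as the encoding when converting a byte string from the user to unicode, instead of hard-coding 'iso8859-1'.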
Unfortunately the latter is not an option, because I definitely need portability. I guess I should probably use
utf-8.
UTF-8 is your friend :-)
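One way to read that advice, sketched under the assumption that byte strings are only needed where text leaves the program (a file, a socket, a byte-oriented API): keep unicode inside and encode to UTF-8 at that boundary. The function names below are hypothetical.

```python
def encode_for_output(text):
    """Encode a unicode string to UTF-8 bytes when it leaves the program."""
    return text.encode('utf-8')


def decode_from_storage(data):
    """Decode UTF-8 bytes back into a unicode string."""
    return data.decode('utf-8')


original = u'A\xe4'
stored = encode_for_output(original)
# The same text comes back after the round trip through UTF-8.
assert decode_from_storage(stored) == original
```

Because UTF-8 is not tied to any one platform's code page, this should also fit the portability requirement.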
Kent