Re: iso_8859_1 mystery/tkinter

Jeff Epler Wed, 18 May 2005 16:03:39 -0700

This isn't about the "sign bit"; it's about the encoding that gets assumed
for byte strings.


In iso_8859_1 and Unicode, the character with value 0xb0 is DEGREE SIGN.
In other character sets, that may not be true. For instance, in "code page
437" (the DOS/OEM code page on Windows), it is u'\u2591', aka LIGHT SHADE
(a half-tone pattern).

When you write code like
    x = '%c' % (0xb0)
and then pass x to a Tkinter call, Tkinter treats it as a byte string
encoded in some system-default encoding, which could give DEGREE SIGN, could
give LIGHT SHADE, or could give other characters (a Thai user of Windows
might see THAI CHARACTER THO THAN, for instance, and I would see a
question mark because I use UTF-8, where a lone 0xb0 byte is an invalid
sequence).
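
If you want to see the ambiguity for yourself, here is a rough sketch that decodes the single byte 0xb0 under a few codecs and looks up the Unicode name of each result. It assumes a Python 3 interpreter (the snippets in this message are Python 2), and the codec list and the variable names are just my picks for illustration.

```python
import unicodedata

raw = b'\xb0'  # the one byte that '%c' % (0xb0) yields as a Python 2 byte string

# Codecs picked for illustration; all three ship with the standard library.
names = {}
for codec in ('iso-8859-1', 'cp437', 'cp874'):
    ch = raw.decode(codec)
    names[codec] = unicodedata.name(ch)
    print('%-10s U+%04X %s' % (codec, ord(ch), names[codec]))

# In UTF-8, 0xb0 is a continuation byte with no lead byte, so strict
# decoding fails instead of producing a character.
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print('utf-8 decodes cleanly:', utf8_ok)
```

The same byte comes back as three different characters, and as no character at all under UTF-8.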

By using
    x = u'%c' % (0xb0)
you get a unicode string, and there is no confusion about the meaning of
the symbol: you always get DEGREE SIGN.
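
As a small check of that claim (again a sketch, assuming Python 3, where the u'' prefix is still accepted and every str is a unicode string), the code point is fixed at U+00B0 and only the bytes chosen at encoding time differ:

```python
import unicodedata

x = u'%c' % (0xb0)  # a text string holding the single code point U+00B0

char_name = unicodedata.name(x)
print('U+%04X %s' % (ord(x), char_name))

# The byte representation is now an explicit choice made when encoding,
# not a guess made by whoever receives the string.
latin1_bytes = x.encode('iso-8859-1')
utf8_bytes = x.encode('utf-8')
print(latin1_bytes, utf8_bytes)
```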

Jeff

