Martin Marcher <[EMAIL PROTECTED]> wrote:
> 25 Oct 2007 17:37:01 GMT, Brent Lievers <[EMAIL PROTECTED]>:
>> Greetings,
>>
>> I have observed the following (python 2.5.1):
>>
>> >>> import sys
>> >>> print sys.stdout.encoding
>> UTF-8
>> >>> print(u'\u00e9')
>> é
>> >>> sys.stdout.write(u'\u00e9\n')
>> Traceback (most recent call last):
>> File "<stdin>", line 1, in <module>
>> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in
>> position 0: ordinal not in range(128)
>
> >>> sys.stdout.write(u'\u00e9\n'.encode("UTF-8"))
> é
>
>> Is this correct? My understanding is that print ultimately calls
>> sys.stdout.write anyway, so I'm confused as to why the Unicode error
>> occurs in the second case. Can someone explain?
>
> you forgot to encode what you are going to "print" :)

Thanks. I obviously have a lot to learn about both Python and Unicode ;-)

So does print do this encoding step based on the value of
sys.stdout.encoding? In other words, something like:

    sys.stdout.write(textstr.encode(sys.stdout.encoding))

I'm just trying to understand why encode() is needed in the one case but
not the other.
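
To make my guess concrete, here is a rough sketch of what I imagine is
going on. It is only my mental model, not the real implementation of
print: FakeStdout and print_like are names I made up, and the ASCII
fallback inside write() is an assumption based on the traceback above.

```python
class FakeStdout(object):
    """Hypothetical stand-in for a byte-oriented stdout (not a real API)."""

    def __init__(self, encoding):
        self.encoding = encoding
        self.chunks = []

    def write(self, data):
        # Assumption: Unicode text handed straight to write() is encoded
        # with the default 'ascii' codec, as the traceback suggests.
        if isinstance(data, type(u"")):
            data = data.encode("ascii")
        self.chunks.append(data)


def print_like(text, stream):
    # Assumption: print consults the stream's encoding before writing.
    encoding = getattr(stream, "encoding", None) or "ascii"
    stream.write((text + u"\n").encode(encoding))


out = FakeStdout("UTF-8")
print_like(u"\u00e9", out)      # encoded as UTF-8 first, so this succeeds

try:
    out.write(u"\u00e9\n")      # no explicit encode: falls back to ASCII
    failed = False
except UnicodeEncodeError:
    failed = True
```

If that picture is right, the only difference between the two cases is
whether something looks at the stream's encoding attribute before the
bytes are written.
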
Cheers,
Brent
--
http://mail.python.org/mailman/listinfo/python-list