Re: AW: [PythonCE] UnicodeDecodeError with print

Michael Foord Tue, 15 Mar 2005 00:34:17 -0800

Thanks for the reply Sebastian - very helpful. I didn't know about ``sys.stdout.encoding`` which was part of the missing information. You filled in a lot more.

The PDA itself is capable of displaying all sorts of weird and wonderful characters - I wonder if it's possible to access this functionality from pythonce ? I don't *need* to, I'm only curious.

Regards,

Fuzzy
http://www.voidspace.org.uk/python/index.shtml


[EMAIL PROTECTED] wrote:

Hello Michael,

This can happen with "normal" Python as well; try it by running python.exe
directly.
The issue is not that you create a unicode object, the issue is that you
want to print it. On the PC when using IDLE, sys.stdout.encoding is set to
"cp1252". For file objects, "when Unicode strings are written to a file,
they will be converted to byte strings using this encoding".

On the PDA, however, sys.stdout is a built-in module _pcceshell_support.
Seemingly this module tries to convert unicode objects into strings using
the "ascii" encoding, and this stumbles on the pound sign.
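
A minimal sketch of that failure mode, runnable on a desktop Python. It only
shows that the "ascii" codec cannot represent the pound sign while "cp1252"
can; it does not reproduce what _pcceshell_support does internally, and the
exception type seen on the PDA may differ. The helper name try_encode is
made up for this illustration.

```python
# The pound sign is written as an escape so this file itself stays pure ASCII.
pound = u'\u00a3'

def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

ascii_result = try_encode(pound, 'ascii')
cp1252_result = try_encode(pound, 'cp1252')

print(ascii_result is None)   # True: "ascii" has no pound sign
print(len(cp1252_result))     # 1: cp1252 stores it as the single byte a3
```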

Thus, on the PDA, you have to take care of the conversion of unicode objects
to strings yourself before calling print. For instance:
        u = u'£'
        print u.encode("cp1252")

Now this doesn't fail, but it doesn't print a pound sign either. But at
least it seems to do the same as if you had written:
        print '£'
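
One way to package the "encode it yourself before printing" step is a small
helper, sketched below. The names encode_for and FakeShell are hypothetical,
and the "cp1252" fallback is an assumption taken from the example above, not
something pythonce prescribes. It uses the stream's own encoding when one is
reported and falls back otherwise.

```python
import sys

def encode_for(stream, text, fallback='cp1252'):
    """Encode text for stream, using fallback if it reports no encoding."""
    encoding = getattr(stream, 'encoding', None) or fallback
    # 'replace' substitutes '?' for characters the codec cannot represent,
    # so this call never raises UnicodeEncodeError.
    return text.encode(encoding, 'replace')

class FakeShell(object):
    """Hypothetical stand-in for a stdout object that has no encoding."""
    encoding = None

pound = u'\u00a3'
print(repr(encode_for(FakeShell(), pound)))   # falls back to cp1252
print(repr(encode_for(sys.stdout, pound)))    # depends on your console
```

The result is a byte string, so on the PDA it could be passed to print in
place of the unicode object.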

[It gets more interesting when the string '£' is contained in a source file.
Try to put the following into a file using IDLE on the PC, and run it:
        # -*- coding: utf-8 -*-
        print "£".encode("hex")
You'll get c2a3. This is because IDLE itself sees the first line of your
file, and encodes the pound sign as c2a3 when storing the file.

Or try in a file:
        # -*- coding: utf-8 -*-
        print [hex(ord(c)) for c in u"£"]
This time, IDLE stored the pound sign as c2a3 again, but Python also uses
the magic first line when building the unicode object to convert c2a3 to a
unicode character.

When you use Pocket Word, or so, to create a Python source file, it will of
course not understand your magic first line and bluntly store the pound sign
as a3. Thus, all of your string literals are actually cp1252-encoded.]
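
The byte values mentioned in the aside can be checked without depending on
how any editor saves the file. This is only a sketch of the two encodings
involved; it builds the pound sign from an escape rather than from a literal
in the source.

```python
import binascii

pound = u'\u00a3'

def to_hex(data):
    """Return the bytes as a hex string such as 'c2a3'."""
    return binascii.hexlify(data).decode('ascii')

utf8_hex = to_hex(pound.encode('utf-8'))      # what a utf-8 aware editor stores
cp1252_hex = to_hex(pound.encode('cp1252'))   # what a cp1252 editor stores
print(utf8_hex)          # c2a3
print(cp1252_hex)        # a3
print(hex(ord(pound)))   # 0xa3, the code point itself

# Reading utf-8 bytes as cp1252 succeeds, but yields two characters.
wrong = pound.encode('utf-8').decode('cp1252')
print(len(wrong))        # 2

# Reading the lone cp1252 byte as utf-8 fails outright.
try:
    pound.encode('cp1252').decode('utf-8')
    failed = False
except UnicodeDecodeError:
    failed = True
print(failed)            # True
```

So a file saved as cp1252 but declared as utf-8 does not silently work: the
stored byte a3 is not valid utf-8 on its own.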

Regards,
Sebastian

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Michael Foord
Sent: Monday, 14 March 2005 13:04
To: [email protected]
Subject: [PythonCE] UnicodeDecodeError with print


I am wondering if anyone knows why:

print u'£'

should cause a UnicodeDecodeError on pythonce ? (The usual 'ascii codec
cannot decode character...' message).

Obviously the '£' character is a non-ascii character. I am just
surprised that the print statement is using the ascii encoding at all
and not just passing the string to sys.stdout.

The particular reason I ask is that this doesn't happen with 'normal'
python... but I would like to know how the print statement decodes
unicode strings it prints. Since it *doesn't* raise an error normally it
obviously doesn't use defaultencoding - so why does the pythonce one?

Yours curiously,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml
_______________________________________________
PythonCE mailing list
[email protected]
http://mail.python.org/mailman/listinfo/pythonce

