Re: Python parsing iTunes XML/COM

Jerry Hill Thu, 31 Jul 2008 14:51:37 -0700

On Thu, Jul 31, 2008 at 9:44 AM, william tanksley <[EMAIL PROTECTED]> wrote:
> I'm using a file, a file that's correctly encoded as UTF-8, and it
> returns some text elements that are raw bytes (undecoded). I have to
> manually decode them.


I can't reproduce this behavior.  Here's a simple test case:

C:\Program Files\Python25>python -V
Python 2.5.2

C:\Program Files\Python25>more t.py
import xml.etree.cElementTree as ET

xml_string = """<?xml version="1.0" encoding="UTF-8"?>
<character title="GREEK SMALL LETTER PI">\xcf\x80</character>"""

outfile = open('sample.xml', 'wb')
outfile.write(xml_string)
outfile.close()

tree = ET.parse('sample.xml')
root = tree.getroot()
print type(root.text)
print repr(root.text)
print root.text


C:\Program Files\Python25>python t.py
<type 'unicode'>
u'\u03c0'
π

That works as expected.  I wrote out a UTF-8 encoded bytestring with
a proper XML encoding declaration, and when I parsed the file with
cElementTree, root.text came back as a unicode object rather than raw
bytes.  Does this same program work for you?  If so, maybe you need to
show us more of your code so we can see where things are going wrong.
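
The same check can be run without the temporary file by parsing the
bytes in memory.  This is only a sketch, and it assumes a current
Python 3 interpreter, where the module is xml.etree.ElementTree and
decoded text has type str instead of unicode.  The helper name
describe() is made up for illustration.

```python
import xml.etree.ElementTree as ET

# UTF-8 bytes for GREEK SMALL LETTER PI, with an explicit encoding declaration.
xml_bytes = (b'<?xml version="1.0" encoding="UTF-8"?>\n'
             b'<character title="GREEK SMALL LETTER PI">\xcf\x80</character>')

# Parse directly from the byte string instead of going through sample.xml.
root = ET.fromstring(xml_bytes)


def describe(text):
    """Return the type name and the code points of a parsed text value."""
    return type(text).__name__, [hex(ord(ch)) for ch in text]


# A decoded value shows a single code point (0x3c0); undecoded bytes
# would show up as two separate values (0xcf and 0x80) instead.
print(describe(root.text))
```

Running describe() on the text elements that look wrong in your own
file should show quickly whether they were decoded or not.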

-- 
Jerry
--
http://mail.python.org/mailman/listinfo/python-list
