Attached is the smallest test case showing that ElementTree returns a string object if the text in the tree is pure ASCII, but returns a unicode object otherwise.
This would make sense if the string object and the unicode object were interchangeable, but they are not. One example: the translate method takes completely different arguments on each. I've tested with cElementTree (1.0.2) too, and it has the same behaviour.

Any suggestions? Do I need to check the output of ElementTree every time, or is there some hidden switch to change this behaviour?

from elementtree import ElementTree

xml = """\
<?xml version="1.0" encoding="UTF-8"?>
<root>
<p1> ascii </p1>
<p2> \xd0\xba\xd0\xb8\xd1\x80\xd0\xb8\xd0\xbb\xd0\xb8\xd1\x86\xd0\xb0 </p2>
</root>
"""

tree = ElementTree.fromstring(xml)
p1, p2 = tree.getchildren()
# p1 holds ASCII-only text, p2 holds UTF-8 encoded Cyrillic text
print "type(p1.text):", type(p1.text)
print "type(p2.text):", type(p2.text)
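
For illustration, "checking the output every time" could look roughly like the sketch below. The helper name as_unicode is hypothetical and not part of ElementTree, and the sketch assumes that any byte string coming back from the tree is plain ASCII, which matches the behaviour described above. It is only one possible workaround, not a switch in the library.

```python
import sys

# Pick the text type for the running interpreter: unicode on Python 2,
# str on Python 3. The else branch is never evaluated on Python 3.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa: F821


def as_unicode(value, encoding="ascii"):
    """Hypothetical helper: return value as a unicode object.

    None and unicode objects pass through unchanged; byte strings are
    decoded with the given encoding (assumed to be ASCII here).
    """
    if value is None or isinstance(value, text_type):
        return value
    return value.decode(encoding)
```

With this, as_unicode(p1.text) and as_unicode(p2.text) would both be unicode objects, so methods such as translate behave the same way for every element.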