Fredrik Lundh <fred...@effbot.org> added the comment:

> if I don't specify an encoding, I get unicode. If I do specify an encoding,
> I get encoded bytes.
You're confusing the XML document encoding with character set encoding. A serialized (unparsed) XML document is a byte stream, not a string of Unicode characters. The character set encoding is both embedded in that byte stream and affects how it's generated in more than one way; you cannot just recode XML documents willy-nilly and expect things to work.

A parsed XML document (an infoset) -- for ET, that's the tree of Element objects -- does indeed contain Unicode strings, but the transformation from the byte stream to those Unicode strings doesn't just involve character set decoding; there are several other constructs that are handled by the XML parser.

> Ha. There has been a very long temporal window

You should have had plenty of time to fix it, then, right?

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8047>
_______________________________________
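
To illustrate the point about the encoding being both embedded in the byte stream and affecting how it is generated, here is a minimal sketch. It assumes the xml.etree.ElementTree module as shipped with current Python 3 releases, which may differ from the version under discussion in this issue; the element name "doc" and the sample text are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample tree: one element holding a non-ASCII character.
root = ET.Element("doc")
root.text = "caf\u00e9"

# Serializing yields bytes. The chosen encoding is recorded in the XML
# declaration and also decides how the character itself is written.
latin1 = ET.tostring(root, encoding="iso-8859-1")  # raw byte 0xE9 plus a declaration
ascii_ = ET.tostring(root, encoding="us-ascii")    # character reference instead

# Parsing the original bytes recovers the Unicode text.
roundtrip = ET.fromstring(latin1).text

# Naive recoding: the bytes are now UTF-8, but the declaration inside
# them still claims iso-8859-1, so the parser misreads the text.
recoded = latin1.decode("iso-8859-1").encode("utf-8")
mangled = ET.fromstring(recoded).text

print(latin1)
print(ascii_)
print(roundtrip == root.text, mangled == root.text)
```

The recoded document still parses, but its text no longer matches the original, which is the kind of silent breakage the comment warns about.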