Uli Kunitz <uli.kun...@googlemail.com> added the comment:

I believe handling of TextIOWrapper streams is broken in 
xml.etree.ElementTree.ElementTree.write().

Example 1:

import sys
from xml.etree import ElementTree

element = ElementTree.fromstring("""<foo><bar>foobar</bar></foo>""")
element_tree = ElementTree.ElementTree(element)

assert sys.stdout.encoding == "UTF-8"
element_tree.write(sys.stdout, encoding="UTF-8")
print()

I don't think that writing a tree to a stream with the correct encoding should 
cause any problem at all.

The output looks like this:

Traceback (most recent call last):
  File "/home/kunitz/test/lib/python3.2/xml/etree/ElementTree.py", line 825, in write
    "xmlcharrefreplace"))
TypeError: must be str, not bytes

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "bug1.py", line 9, in <module>
    element_tree.write(sys.stdout, encoding="UTF-8")
  File "/home/kunitz/test/lib/python3.2/xml/etree/ElementTree.py", line 843, in write
    write("<?xml version='1.0' encoding='%s'?>\n" % encoding_)
  File "/home/kunitz/test/lib/python3.2/xml/etree/ElementTree.py", line 827, in write
    _raise_serialization_error(text)
  File "/home/kunitz/test/lib/python3.2/xml/etree/ElementTree.py", line 1077, in _raise_serialization_error
    "cannot serialize %r (type %s)" % (text, type(text).__name__)
TypeError: cannot serialize "<?xml version='1.0' encoding='UTF-8'?>\n" (type str)
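
For comparison, a minimal sketch of the same call against a binary stream, 
which accepts the bytes that write() produces when an explicit encoding is 
given. The io.BytesIO object is only an assumption made here so the result can 
be inspected; on a real console sys.stdout.buffer would presumably play the 
same role. Whether an XML declaration is emitted for "UTF-8" may differ between 
Python versions, so the sketch does not rely on it.

```python
import io
from xml.etree import ElementTree

element = ElementTree.fromstring("""<foo><bar>foobar</bar></foo>""")
element_tree = ElementTree.ElementTree(element)

# Assumption: io.BytesIO stands in for a binary stream such as
# sys.stdout.buffer. With an explicit encoding, write() emits bytes,
# which a binary stream accepts and a text stream rejects.
binary_stream = io.BytesIO()
element_tree.write(binary_stream, encoding="UTF-8")

utf8_output = binary_stream.getvalue()
print(utf8_output.decode("UTF-8"))
```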

Example 2:

import sys
from xml.etree import ElementTree

element = ElementTree.fromstring("""<foo><bar>fööbar</bar></foo>""")
element_tree = ElementTree.ElementTree(element)

with open("bug2.xml", "w", encoding="US-ASCII") as f:
    element_tree.write(f)

The first ö raises a UnicodeEncodeError here, although the method could fall 
back to XML character references. This is arguable, but the method could take 
care of the problem itself.
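
To illustrate the character references meant here, a small sketch that writes 
the same tree to a binary stream with encoding="US-ASCII". The io.BytesIO 
target is an assumption chosen for illustration (a file opened with "wb" 
should behave the same way); in that case the non-ASCII characters come out as 
numeric references instead of raising an error.

```python
import io
from xml.etree import ElementTree

element = ElementTree.fromstring("""<foo><bar>fööbar</bar></foo>""")
element_tree = ElementTree.ElementTree(element)

# Assumption: a binary stream (io.BytesIO here) replaces the text file.
# When write() does the encoding itself, characters outside US-ASCII
# are replaced by XML character references.
binary_stream = io.BytesIO()
element_tree.write(binary_stream, encoding="US-ASCII")

ascii_output = binary_stream.getvalue()
print(ascii_output.decode("US-ASCII"))
```

The ö character is U+00F6, so each occurrence appears as &#246; in the output.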

Example 3:

import sys
from xml.etree import ElementTree

element = ElementTree.fromstring("""<foo><bar>fööbar</bar></foo>""")
element_tree = ElementTree.ElementTree(element)

with open("bug3.xml", "w", encoding="ISO-8859-1",
          errors="xmlcharrefreplace") as f:
    element_tree.write(f, xml_declaration=True)

This finally creates an ISO-8859-1 encoded XML file, but without an XML 
declaration. Didn't we request one?

Example 4: Try to do the right thing.

import sys
from xml.etree import ElementTree

element = ElementTree.fromstring("""<foo><bar>fööbar</bar></foo>""")
element_tree = ElementTree.ElementTree(element)

with open("bug4.xml", "w", encoding="ISO-8859-1",
          errors="xmlcharrefreplace") as f:
    element_tree.write(f, encoding="ISO-8859-1", xml_declaration=True)

Here we get the same exception as in example 1, of course.
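
For contrast, a sketch of a variant that does produce an ISO-8859-1 document 
with a declaration: the encoding is passed to write() and the target is a 
binary stream rather than a TextIOWrapper. The io.BytesIO object is an 
assumption for illustration; a file opened with "wb" should work the same way. 
This is a workaround, not the behavior requested for text streams.

```python
import io
from xml.etree import ElementTree

element = ElementTree.fromstring("""<foo><bar>fööbar</bar></foo>""")
element_tree = ElementTree.ElementTree(element)

# Assumption: io.BytesIO stands in for open("bug4.xml", "wb").
# write() encodes the tree itself and prepends the XML declaration.
binary_stream = io.BytesIO()
element_tree.write(binary_stream, encoding="ISO-8859-1", xml_declaration=True)

latin1_output = binary_stream.getvalue()
print(latin1_output.decode("ISO-8859-1"))
```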

All the files can be found in the tar archive attached below.

----------
title: xml.etree.ElementTree.write(): encoding handling problems -> 
xml.etree.ElementTree.ElementTree.write(): encoding handling problems
type:  -> behavior
Added file: http://bugs.python.org/file18349/bugs.tar.gz

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9458>
_______________________________________