New submission from Moriyoshi Koizumi <mozo+pyt...@mozo.jp>:

ElementTree does not correctly serialize end-of-line characters (#xA, #xD) in attribute values. According to the specification [1], the parser converts bare end-of-line characters in attribute values to #x20. An end-of-line character that is still present after parsing must therefore have been written as a character reference in the original document, and it has to be serialized as a character reference again.
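The normalization rule described above can be observed directly with a small sketch. It assumes Python 3 and the standard library's expat-backed parser rather than the Python 2.6 setup in this report, and the helper names parse_attr and survives_roundtrip are made up for illustration. Whether the round trip succeeds depends on the Python version, so the sketch reports that result without assuming it.

```python
import xml.etree.ElementTree as ET


def parse_attr(raw_attr_text):
    """Parse <foo attr="..."/> and return the attribute value the parser reports.

    raw_attr_text is placed verbatim between the quotes, so it may contain
    either bare control characters or character references such as &#10;.
    """
    doc = '<foo attr="%s"/>' % raw_attr_text
    return ET.fromstring(doc).get("attr")


def survives_roundtrip(raw_attr_text):
    """Return True if the parsed value is unchanged after write + re-parse."""
    value = parse_attr(raw_attr_text)
    serialized = ET.tostring(ET.Element("foo", {"attr": value}), encoding="unicode")
    return ET.fromstring(serialized).get("attr") == value


# A bare line feed is normalized to a space by the parser.
bare = parse_attr("a\nb")            # 'a b'

# A line feed written as a character reference is kept as a real line feed.
referenced = parse_attr("a&#10;b")   # 'a\nb'

if __name__ == "__main__":
    print(repr(bare), repr(referenced))
    # If the serializer emits the line feed bare, the second parse turns it
    # into a space and this prints False.
    print(survives_roundtrip("a&#10;b"), survives_roundtrip("a&#13;b"))
```

If survives_roundtrip returns False for a value, the serializer wrote the character bare and the information was lost on the second parse, which is the behaviour this report is about.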
[1] http://www.w3.org/TR/xml11/#AVNormalize

### sample code

from xml.etree.ElementTree import ElementTree
from cStringIO import StringIO

# builder = ElementTree(file=StringIO("<foo>\x0d</foo>"))
# out = StringIO()
# builder.write(out)
# print out.getvalue()

out = StringIO()
ElementTree(file=StringIO(
'''<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE foo [
<!ELEMENT foo (#PCDATA)>
<!ATTLIST foo attr CDATA "">
]>
<foo attr=" test test  test a "> </foo>
''')).write(out)

# should be "<foo attr=" test test test a ">\x0a</foo>
print out.getvalue()

out = StringIO()
ElementTree(file=StringIO(
'''<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE foo [
<!ELEMENT foo (#PCDATA)>
<!ATTLIST foo attr NMTOKENS "">
]>
<foo attr=" test test  test a "> </foo>
''')).write(out)

# should be "<foo attr="test test test a">\x0a</foo>
print out.getvalue()

----------
components: XML
messages: 94074
nosy: moriyoshi
severity: normal
status: open
title: Incorrect serialization of end-of-line characters in attribute values
type: behavior
versions: Python 2.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue7139>
_______________________________________