[issue5752] xml.dom.minidom does not escape newline characters within attribute values
Tomalak m8r-t1tu...@mailinator.com added the comment:

Francesco,

> if you want to encode the newline character, this should be done by both parseString and setAttribute methods. Otherwise, the behaviour is not symmetric.

I believe you still don't see the issue. The behaviour is not symmetric *now*. You store a '\n' in an attribute value with setAttribute(), save the document to XML, load it again and out comes a space where the '\n' should have been. The point is that parseString() behaves correctly, but serializing does not. There is only one side to fix, because only one side is broken.

> If you want to encode the newline in different manner, you should develop a patch that introduces this kind of encoding in both parseString and setAttribute methods.

It would be pointless to do the encoding in setAttribute(). The valid ways to XML-encode a '\n' character are '&#xA;', '&#x0A;' or '&#10;'. Doing so in setAttribute() would produce doubly encoded output, like this: '&amp;#10;'. This is even more wrong. However, if parseString() encounters a '&#10;' in the input, it correctly translates this to '\n' in the DOM. As I said, there is nothing to fix in parsing, this exercise is about getting minidom to actually *output* a '&#10;' where appropriate. :-)

--
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5752

Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
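The two claims in this comment (parsing already decodes character references, and pre-encoding a value before setAttribute() gets doubly encoded) can be checked with a short sketch. The element and attribute names here are made up for illustration; only documented minidom calls are used.

```python
from xml.dom.minidom import Document, parseString

# Parsing side: a character reference is decoded to a real newline, while a
# literal newline inside an attribute value is normalized to a space.
root = parseString('<root a="x&#10;y" b="x\ny"/>').documentElement
from_reference = root.getAttribute("a")
from_literal = root.getAttribute("b")

# Serializing side: if the caller pre-encodes the newline before calling
# setAttribute(), the serializer escapes the ampersand of the reference,
# so the output carries a doubly encoded value.
doc = Document()
elem = doc.createElement("root")
elem.setAttribute("a", "x&#10;y")
doc.appendChild(elem)
serialized = doc.toxml()

print(repr(from_reference), repr(from_literal))
print(serialized)
```

This is why encoding has to happen when the document is written out, not when the attribute value is stored.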
Tomalak m8r-t1tu...@mailinator.com added the comment:

Daniel Diniz: The proposed behaviour is correct: http://www.w3.org/TR/2000/WD-xml-c14n-2119.html#charescaping

> In attribute values, the character information items TAB (#x9), newline (#xA), and carriage-return (#xD) are represented by &#x9;, &#xA;, and &#xD; respectively.

Since the behaviour is correct, it is also desirable. :-) I don't think that this change could cause existing solutions to break, since the current inconsistency in handling these characters makes it impossible to rely on it anyway. Thanks for putting up the unit test diff.
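As a rough illustration of the escaping rule quoted above, here is a minimal sketch of an attribute-value escaper. The function name `escape_attr_value` is hypothetical and this is not the code from the attached minidom.patch; it only shows that values escaped this way survive a parse unchanged.

```python
from xml.dom.minidom import parseString


def escape_attr_value(value):
    """Hypothetical helper: escape text for a double-quoted XML attribute."""
    replacements = [
        ("&", "&amp;"),   # must run first so later references stay intact
        ("<", "&lt;"),
        ('"', "&quot;"),
        ("\t", "&#x9;"),  # TAB
        ("\n", "&#xA;"),  # newline
        ("\r", "&#xD;"),  # carriage return
    ]
    for char, reference in replacements:
        value = value.replace(char, reference)
    return value


original = 'line1\nline2\tcol\rend & "quoted" <tag>'
xml_text = '<root a="%s"/>' % escape_attr_value(original)
restored = parseString(xml_text).documentElement.getAttribute("a")
print(restored == original)
```

The ampersand replacement comes first on purpose: running it last would re-escape the references produced by the other replacements.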
Changes by Tomalak m8r-t1tu...@mailinator.com: Removed file: http://bugs.python.org/file13919/minidom.patch
Changes by Tomalak m8r-t1tu...@mailinator.com: Added file: http://bugs.python.org/file13977/minidom.patch
Tomalak m8r-t1tu...@mailinator.com added the comment: I changed the patch to include support for TAB characters, which were also left unencoded before. Also I switched encoding from '&#13;' etc. to '&#xD;'. This is equivalent, but the spec uses the latter variant.
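The equivalence of the decimal and hexadecimal forms is easy to confirm from the parsing side. A small sketch, with attribute names invented for the example:

```python
from xml.dom.minidom import parseString

# The same carriage return written once as a decimal and once as a
# hexadecimal character reference.
root = parseString('<root dec="a&#13;b" hex="a&#xD;b"/>').documentElement
decimal_form = root.getAttribute("dec")
hex_form = root.getAttribute("hex")

print(decimal_form == hex_form)
```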
Francesco Sechi francesco.se...@iet.unipi.it added the comment: My position is: if you want to encode the newline character, this should be done by both parseString and setAttribute methods. Otherwise, the behaviour is not symmetric. My patch translates the newline character with a whitespace in the setAttribute method, because parseString already does it. If you want to encode the newline in different manner, you should develop a patch that introduces this kind of encoding in both parseString and setAttribute methods.
Francesco Sechi francesco.se...@iet.unipi.it added the comment: All right, now I understand, thanks. But I think that, for internal class coherence, it is necessary to modify not the toxml method but the setAttribute one, because this is the source of the problem.
Francesco Sechi francesco.se...@iet.unipi.it added the comment: A solution for this issue could be to change the setAttribute method as follows:

- d["value"] = d["nodeValue"] = value
+ d["value"] = d["nodeValue"] = value.replace('\n', ' ')

NOTE: I didn't make a patch, because I don't know which Python version you are using. Please try this solution and give me feedback, thanks.
Changes by Francesco Sechi francesco.se...@iet.unipi.it: Removed file: http://bugs.python.org/file13837/test_toxml.py
Changes by Francesco Sechi francesco.se...@iet.unipi.it: Added file: http://bugs.python.org/file13960/test_toxml.py
Francesco Sechi francesco.se...@iet.unipi.it added the comment: I have uploaded a test script that shows that, without my patch, the methods setAttribute and parseString work differently; adding my patch, the behaviour is symmetric.
Daniel Diniz aja...@gmail.com added the comment:

Francesco,

Your patch still doesn't allow one to add multiline attribute values, as Tomalak describes:

> The catch: This leads to an actual data loss if I *wanted* to store newline characters in an attribute -- unless the newline characters are properly encoded. Encoding the newline characters is also valid and conforms to the spec, so the DOM implementation should do it.

I'm not sure whether the proposed behavior is correct or desirable. Even if it is correct, it might introduce backwards incompatible changes in behavior.

Here's a test case for trunk.

-- nosy: +ajaksu2
Added file: http://bugs.python.org/file13966/test_multiline_roundtrip.diff
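A round-trip check of the kind discussed here can be sketched as below. This is not the content of the attached test_multiline_roundtrip.diff; the helper name `round_trip` is hypothetical. The result for the multiline value depends on whether the minidom in use escapes newlines when serializing, so the sketch prints it rather than assuming either outcome.

```python
from xml.dom.minidom import Document, parseString


def round_trip(value):
    """Hypothetical helper: store value in an attribute, serialize, re-parse."""
    doc = Document()
    elem = doc.createElement("root")
    elem.setAttribute("a", value)
    doc.appendChild(elem)
    return parseString(doc.toxml()).documentElement.getAttribute("a")


plain = round_trip("one line")
multiline = round_trip("first\nsecond")

# If the serializer writes a raw newline, the parser normalizes it to a
# space and the original value is lost; if it writes a character
# reference, the newline comes back intact.
print(repr(plain), repr(multiline))
```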
Changes by Daniel Diniz aja...@gmail.com: -- priority: -> normal stage: test needed -> patch review
Changes by Tomalak m8r-t1tu...@mailinator.com: -- title: xml.dom.minidom does not handle newline characters in attribute values -> xml.dom.minidom does not escape newline characters within attribute values