[issue5752] xml.dom.minidom does not escape newline characters within attribute values

2009-05-13 Thread Tomalak

Tomalak m8r-t1tu...@mailinator.com added the comment:

Francesco,

> if you want to encode the newline character,
> this should be done by both parseString and
> setAttribute methods. Otherwise, the
> behaviour is not symmetric.

I believe you still don't see the issue. The behaviour is not symmetric
*now*. You store a '\n' in an attribute value with setAttribute(), save
the document to XML, load it again and out comes a space where the '\n'
should have been.

The point is that parseString() behaves correctly, but serializing does
not. There is only one side to fix, because only one side is broken.
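
A minimal sketch of the round trip described above. The helper name
reload_attribute is made up for illustration. On an interpreter without
a fix the value should come back with a space in place of the '\n'; on
one where the serializer escapes the character, it should come back
unchanged.

```python
from xml.dom import minidom


def reload_attribute(value):
    """Store value in an attribute, serialize the document, parse it
    again, and return the attribute value that comes back."""
    doc = minidom.Document()
    elem = doc.createElement("e")
    elem.setAttribute("v", value)
    doc.appendChild(elem)
    xml_text = doc.toxml()                    # serializing side
    reloaded = minidom.parseString(xml_text)  # parsing side
    return reloaded.documentElement.getAttribute("v")


original = "line1\nline2"
result = reload_attribute(original)
# The outcome depends on whether the serializer escapes the newline.
print(repr(original), "->", repr(result))
```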

> If you want to encode the newline in different
> manner, you should develop a patch that
> introduces this kind of encoding in both
> parseString and setAttribute methods.

It would be pointless to do the encoding in setAttribute(). The valid
ways to XML-encode a '\n' character are '&#xA;', '&#x0A;' or '&#10;'. Doing
so in setAttribute() would produce doubly encoded output, like this:
'&amp;#10;'. This is even more wrong.
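
To illustrate the double encoding, here is a small sketch that
pre-encodes the newline by hand before calling setAttribute(). The
element and attribute names are arbitrary.

```python
from xml.dom import minidom

doc = minidom.Document()
elem = doc.createElement("e")
doc.appendChild(elem)

# Hypothetical "fix": encode the newline before it reaches the DOM.
elem.setAttribute("v", "line1&#10;line2")

# The serializer escapes the '&', so the reference is encoded twice.
serialized = elem.toxml()
print(serialized)

# Parsing it back yields the literal text '&#10;', not a newline.
reparsed = minidom.parseString(serialized).documentElement.getAttribute("v")
print(repr(reparsed))
```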

However, if parseString() encounters a '&#10;' in the input, it correctly
translates this to '\n' in the DOM. As I said, there is nothing to fix
in parsing, this exercise is about getting minidom to actually *output*
a '&#10;' where appropriate. :-)
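
A short sketch of the parsing side, assuming the default expat-based
parser behind minidom. A character reference survives as a newline,
while a literal newline is normalized to a space, as XML attribute-value
normalization requires.

```python
from xml.dom import minidom

# Newline written as a character reference in the attribute value.
with_reference = minidom.parseString('<e v="line1&#10;line2"/>')
# Newline written literally inside the attribute value.
with_literal = minidom.parseString('<e v="line1\nline2"/>')

ref_value = with_reference.documentElement.getAttribute("v")
literal_value = with_literal.documentElement.getAttribute("v")

print(repr(ref_value))      # the reference becomes '\n' in the DOM
print(repr(literal_value))  # the literal newline becomes a space
```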

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5752
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5752] xml.dom.minidom does not escape newline characters within attribute values

2009-05-13 Thread Tomalak

Tomalak m8r-t1tu...@mailinator.com added the comment:

Daniel Diniz: 

The proposed behaviour is correct:
http://www.w3.org/TR/2000/WD-xml-c14n-2119.html#charescaping

In attribute values, the character information items 
TAB (#x9), newline (#xA), and carriage-return (#xD) 
are represented by "&#x9;", "&#xA;", and "&#xD;" respectively.

Since the behaviour is correct, it is also desirable. :-)

I don't think that this change could cause existing solutions to break,
since the current inconsistency in handling these characters makes it
impossible to rely on this anyway.

Thanks for putting up the unit test diff.

--




[issue5752] xml.dom.minidom does not escape newline characters within attribute values

2009-05-13 Thread Tomalak

Changes by Tomalak m8r-t1tu...@mailinator.com:


Removed file: http://bugs.python.org/file13919/minidom.patch




[issue5752] xml.dom.minidom does not escape newline characters within attribute values

2009-05-13 Thread Tomalak

Changes by Tomalak m8r-t1tu...@mailinator.com:


Added file: http://bugs.python.org/file13977/minidom.patch




[issue5752] xml.dom.minidom does not escape newline characters within attribute values

2009-05-13 Thread Tomalak

Tomalak m8r-t1tu...@mailinator.com added the comment:

I changed the patch to include support for TAB characters, which were
also left unencoded before.

Also I switched encoding from '&#13;' etc. to '&#xD;'. This is
equivalent, but the spec uses the latter variant.
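
The attached patch is not reproduced in this thread, so the following is
only a sketch of the kind of escaping described, not the patch itself.
The helper name escape_attr_value is hypothetical. It escapes the markup
characters and then writes TAB, newline and carriage return as
hexadecimal character references.

```python
from xml.dom import minidom


def escape_attr_value(value):
    """Escape a string for use inside a double-quoted XML attribute."""
    replacements = [
        ("&", "&amp;"),   # must come first so later references stay intact
        ("<", "&lt;"),
        ('"', "&quot;"),
        ("\t", "&#x9;"),
        ("\n", "&#xA;"),
        ("\r", "&#xD;"),
    ]
    for raw, escaped in replacements:
        value = value.replace(raw, escaped)
    return value


original = 'a\tb\nc\rd & "e" <f>'
xml_text = '<e v="%s"/>' % escape_attr_value(original)
roundtripped = minidom.parseString(xml_text).documentElement.getAttribute("v")
print(xml_text)
print(roundtripped == original)
```

With the three whitespace characters written as references, the parser
hands back the original string, so serializing and parsing agree.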

--




[issue5752] xml.dom.minidom does not escape newline characters within attribute values

2009-05-12 Thread Francesco Sechi

Francesco Sechi francesco.se...@iet.unipi.it added the comment:

My position is: 
if you want to encode the newline character, this should be done by both
parseString and setAttribute methods. Otherwise, the behaviour is not
symmetric.
My patch replaces the newline character with a space in the
setAttribute method, because parseString already does it. If you want to
encode the newline in different manner, you should develop a patch that
introduces this kind of encoding in both parseString and setAttribute
methods.

--




[issue5752] xml.dom.minidom does not escape newline characters within attribute values

2009-05-11 Thread Francesco Sechi

Francesco Sechi francesco.se...@iet.unipi.it added the comment:

All right, now I understand, thanks. But I think that, for internal
class coherence, the method to modify is not toxml but setAttribute,
because the latter is the source of the problem.

--




[issue5752] xml.dom.minidom does not escape newline characters within attribute values

2009-05-11 Thread Francesco Sechi

Francesco Sechi francesco.se...@iet.unipi.it added the comment:

A solution for this issue could be to change the setAttribute method as
follows:
- d["value"] = d["nodeValue"] = value
+ d["value"] = d["nodeValue"] = value.replace('\n',' ')

NOTE: I didn't do a patch, because I don't know which python version you
are using.

Please try this solution and give me feedback, thanks.

--




[issue5752] xml.dom.minidom does not escape newline characters within attribute values

2009-05-11 Thread Francesco Sechi

Changes by Francesco Sechi francesco.se...@iet.unipi.it:


Removed file: http://bugs.python.org/file13837/test_toxml.py




[issue5752] xml.dom.minidom does not escape newline characters within attribute values

2009-05-11 Thread Francesco Sechi

Changes by Francesco Sechi francesco.se...@iet.unipi.it:


Added file: http://bugs.python.org/file13960/test_toxml.py




[issue5752] xml.dom.minidom does not escape newline characters within attribute values

2009-05-11 Thread Francesco Sechi

Francesco Sechi francesco.se...@iet.unipi.it added the comment:

I have uploaded a test script that shows that, without my patch, the
methods setAttribute and parseString work differently; adding my patch,
the behaviour is symmetric.

--




[issue5752] xml.dom.minidom does not escape newline characters within attribute values

2009-05-11 Thread Daniel Diniz

Daniel Diniz aja...@gmail.com added the comment:

Francesco,
Your patch still doesn't allow one to add multiline attribute values
as Tomalak describes:
> The catch: This leads to an actual data loss if I *wanted* to store
> newline characters in an attribute -- unless the newline characters are
> properly encoded. Encoding the newline characters is also valid and
> conforms to the spec, so the DOM implementation should do it.

I'm not sure whether the proposed behavior is correct or desirable. Even
if it is correct, it might introduce backwards incompatible changes in
behavior.

Here's a test case for trunk.

--
nosy: +ajaksu2
Added file: http://bugs.python.org/file13966/test_multiline_roundtrip.diff




[issue5752] xml.dom.minidom does not escape newline characters within attribute values

2009-05-11 Thread Daniel Diniz

Changes by Daniel Diniz aja...@gmail.com:


--
priority:  -> normal
stage: test needed -> patch review




[issue5752] xml.dom.minidom does not escape newline characters within attribute values

2009-05-10 Thread Tomalak

Changes by Tomalak m8r-t1tu...@mailinator.com:


--
title: xml.dom.minidom does not handle newline characters in attribute values 
-> xml.dom.minidom does not escape newline characters within attribute values
