[issue9692] UnicodeDecodeError in ElementTree.tostring()

2012-07-21 Thread Florent Xicluna

Florent Xicluna <florent.xicl...@gmail.com> added the comment:

I propose to close this as won't fix.

The upgrade to ElementTree 1.3 brought some consistency when dealing with 
Unicode and encodings.

The reported behavior was only seen in Python 2.7, when using bytes improperly.

--
nosy: +eli.bendersky
resolution:  -> wont fix
status: open -> closed

___
Python tracker <rep...@bugs.python.org>
http://bugs.python.org/issue9692
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9692] UnicodeDecodeError in ElementTree.tostring()

2011-01-11 Thread Ulrich Seidl

Ulrich Seidl <ulrich.se...@muneda.com> added the comment:

I would suggest adding an additional except branch to (at least) the following 
functions of ElementTree.py:
* _encode,
* _escape_attrib, and
* _escape_cdata 

The except branch could look like:

except UnicodeDecodeError:
    return text.decode( encoding ).encode( encoding, "xmlcharrefreplace" )
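
For illustration, the proposed fallback can be sketched as a standalone function. This is a minimal sketch written for Python 3 rather than the Python 2.7 discussed in this thread; the name encode_with_fallback is hypothetical and is not part of ElementTree, and the assumption that byte input is already in the target encoding is the one made by the proposal itself.

```python
def encode_with_fallback(text, encoding):
    """Hypothetical helper illustrating the proposed except branch."""
    if isinstance(text, bytes):
        # Assumption taken from the proposal: byte strings are already
        # encoded in the target encoding, so decode them with it first.
        text = text.decode(encoding)
    # Characters the target encoding cannot represent are written as
    # numeric character references (&#NNN;) instead of raising an error.
    return text.encode(encoding, "xmlcharrefreplace")


# Latin-1 bytes pass through unchanged.
print(encode_with_fallback(b"\xc4\xd6\xdc", "iso-8859-1"))
# The euro sign is not in Latin-1, so it becomes a character reference.
print(encode_with_fallback("\u20ac", "iso-8859-1"))
```

The weak point of this approach is the assumption itself: nothing guarantees that the bytes were produced with the encoding later passed to tostring().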

--




[issue9692] UnicodeDecodeError in ElementTree.tostring()

2010-08-26 Thread Ulrich Seidl

New submission from Ulrich Seidl <ulrich.se...@muneda.com>:

The following code leads to a UnicodeError in Python 2.7, while it works fine 
in 2.6 and 2.5:

# -*- coding: latin-1 -*-
import xml.etree.cElementTree as ElementTree

oDoc = ElementTree.fromstring(
    '<?xml version="1.0" encoding="iso-8859-1"?><ROOT/>' )
# The attribute value is a byte string, not a unicode string.
oDoc.set( "ATTR", "ÄÖÜ" )
print ElementTree.tostring( oDoc , encoding="iso-8859-1" )
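
A likely explanation, stated here as an assumption rather than something confirmed in this thread: in Python 2, calling .encode() on a byte string first decodes it implicitly as ASCII, and that decode fails for non-ASCII bytes. The sketch below makes that step explicit in Python 3, where the implicit conversion no longer exists.

```python
# Latin-1 bytes for the three characters used in the report.
raw = b"\xc4\xd6\xdc"

error = None
try:
    # Explicit stand-in for the implicit ASCII decode that Python 2
    # performs when .encode() is called on a byte string.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc

print(type(error).__name__)
```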

--
components: XML
messages: 114980
nosy: uis
priority: normal
severity: normal
status: open
title: UnicodeDecodeError in ElementTree.tostring()
versions: Python 2.7




[issue9692] UnicodeDecodeError in ElementTree.tostring()

2010-08-26 Thread Brian Curtin

Changes by Brian Curtin <cur...@acm.org>:


--
nosy: +flox
stage:  -> needs patch
type:  -> behavior




[issue9692] UnicodeDecodeError in ElementTree.tostring()

2010-08-26 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc <amaur...@gmail.com> added the comment:

IMO the code is not correct: how does ElementTree know which encoding is used 
for the attribute value?  Even 2.5 prints different content when the script 
is saved with a different encoding.

The line should look like:
oDoc.set( "ATTR", u"ÄÖÜ" )
or use ASCII-only characters.
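
As a rough illustration of the suggestion, the sketch below passes the attribute value as a text string and lets tostring() do the encoding. It is written for Python 3, where every str is a text string, so it only approximates the u"..." fix for Python 2.7; the variable names are illustrative.

```python
import xml.etree.ElementTree as ElementTree

# Parse from bytes so the encoding declaration is honoured by the parser.
root = ElementTree.fromstring(
    b'<?xml version="1.0" encoding="iso-8859-1"?><ROOT/>')

# The attribute value is text; no encoding is attached to it yet.
root.set("ATTR", "\u00c4\u00d6\u00dc")  # the characters ÄÖÜ

# Only at serialization time is the text turned into Latin-1 bytes.
data = ElementTree.tostring(root, encoding="iso-8859-1")
print(data)
```

Because the value carries no encoding of its own, the same tree can be serialized to any encoding without ElementTree having to guess how the input was encoded.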

--
nosy: +amaury.forgeotdarc




[issue9692] UnicodeDecodeError in ElementTree.tostring()

2010-08-26 Thread Ulrich Seidl

Ulrich Seidl <ulrich.se...@muneda.com> added the comment:

Of course, if you use a unicode string it works, and of course it would be easy 
to switch to unicode for this demo code. Unfortunately, the affected 
application is a little more complex and it is not that easy to switch to 
unicode. I just wonder why the tostring() method does not assume that internal 
strings are encoded in the explicitly provided encoding. Is ElementTree 
restricted to the use of unicode strings? Anyway, why was it working (as 
expected) with Python 2.5 and Python 2.6?

--




[issue9692] UnicodeDecodeError in ElementTree.tostring()

2010-08-26 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc <amaur...@gmail.com> added the comment:

Testing with Python 2.5: oDoc.set( "ATTR", "ÄÖÜ" ) uses the encoding declared 
in the source file (the "# -*- coding: -*-" line). If I use utf-8 instead, the 
output is:
   <ROOT ATTR="&#195;&#132;&#195;&#150;&#195;&#156;" />
which contains the byte values of the three two-byte UTF-8 sequences, one 
character reference per byte.
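
The numbers in that output can be reproduced by hand. The following Python 3 sketch encodes the three characters as UTF-8 and writes each resulting byte as a numeric character reference; it illustrates the arithmetic only and is not the code path ElementTree uses.

```python
# UTF-8 encodes each of these three characters as two bytes.
utf8_bytes = "\u00c4\u00d6\u00dc".encode("utf-8")  # the characters ÄÖÜ
print(list(utf8_bytes))

# Treating every byte as if it were a character yields six references.
refs = "".join("&#%d;" % byte for byte in utf8_bytes)
print(refs)
```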

--




[issue9692] UnicodeDecodeError in ElementTree.tostring()

2010-08-26 Thread Ulrich Seidl

Ulrich Seidl <ulrich.se...@muneda.com> added the comment:

Well, the output of the print is not that interesting as long as ElementTree is 
able to restore the former attribute's value when reading it in again. The 
print was just used to illustrate that a UnicodeDecodeError appears. Think 
about doing 
ElementTree.fromstring( ... ).get( "ATTR" ).encode( "iso-8859-1" ).
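
The round trip described here can be checked with a short Python 3 sketch. It assumes the serialized form is the per-byte character-reference output reported for Python 2.5 earlier in this thread; the variable names are illustrative.

```python
import xml.etree.ElementTree as ElementTree

# Serialized form as reported for Python 2.5 with a UTF-8 source file.
serialized = b'<ROOT ATTR="&#195;&#132;&#195;&#150;&#195;&#156;" />'

# Reading it back yields six characters, one per original byte.
value = ElementTree.fromstring(serialized).get("ATTR")

# Encoding as Latin-1 maps each character back to the byte it came from.
restored = value.encode("iso-8859-1")
print(restored == "\u00c4\u00d6\u00dc".encode("utf-8"))
```

The original bytes do come back, but only because Latin-1 maps code points 0 to 255 directly onto byte values; the parsed attribute itself holds six characters rather than the intended three.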

--
