[issue17089] Expat parser parses strings only when XML encoding is UTF-8

2013-05-22 Thread Serhiy Storchaka

Changes by Serhiy Storchaka:


--
versions:  -Python 2.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17089] Expat parser parses strings only when XML encoding is UTF-8

2013-02-13 Thread Serhiy Storchaka

Changes by Serhiy Storchaka:


--
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed




[issue17089] Expat parser parses strings only when XML encoding is UTF-8

2013-02-04 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 3cc2a2de36e3 by Serhiy Storchaka in branch '3.2':
Issue #17089: Expat parser now correctly works with string input not only when
http://hg.python.org/cpython/rev/3cc2a2de36e3

New changeset 6c27b0e09c43 by Serhiy Storchaka in branch '3.3':
Issue #17089: Expat parser now correctly works with string input not only when
http://hg.python.org/cpython/rev/6c27b0e09c43

New changeset c4e6e560e6f5 by Serhiy Storchaka in branch 'default':
Issue #17089: Expat parser now correctly works with string input not only when
http://hg.python.org/cpython/rev/c4e6e560e6f5

--
nosy: +python-dev




[issue17089] Expat parser parses strings only when XML encoding is UTF-8

2013-01-31 Thread Serhiy Storchaka

New submission from Serhiy Storchaka:

xmlparser.Parse() handles string data correctly only if the XML encoding is
UTF-8 (or ASCII). Examples:

>>> import xml.parsers.expat
>>> parser = xml.parsers.expat.ParserCreate()
>>> content = []
>>> parser.CharacterDataHandler = content.append
>>> parser.Parse("\xb5")
1
>>> content
['µ']
>>> parser = xml.parsers.expat.ParserCreate()
>>> content = []
>>> parser.CharacterDataHandler = content.append
>>> parser.Parse("\xb5")
1
>>> content
['µ']
>>> parser = xml.parsers.expat.ParserCreate()
>>> content = []
>>> parser.CharacterDataHandler = content.append
>>> parser.Parse("\xb5")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
xml.parsers.expat.ExpatError: encoding specified in XML declaration is 
incorrect: line 1, column 30
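
The same kind of check can be sketched as a script. This is only an illustration: the helper name parse_text and the document strings (an XML declaration with a varying encoding around a single U+00B5 character) are assumptions chosen for the example. On an interpreter that includes the fix, both declared encodings should yield the same character data.

```python
import xml.parsers.expat


def parse_text(document):
    """Feed a str to a fresh Expat parser and return its character data."""
    parser = xml.parsers.expat.ParserCreate()
    chunks = []
    # Expat may deliver character data in several pieces, so collect them all.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)  # True marks this as the final chunk
    return "".join(chunks)


# Hypothetical documents: same content, different declared encodings.
TEMPLATE = "<?xml version='1.0' encoding='{}'?><tag>\xb5</tag>"

results = {enc: parse_text(TEMPLATE.format(enc))
           for enc in ("utf-8", "iso-8859-1")}
for enc, text in results.items():
    # ascii() keeps the output readable on any terminal encoding.
    print(enc, ascii(text))
```

An unfixed interpreter would be expected to return different text for the second declaration, because the string is handed to Expat as UTF-8 bytes while the declaration names another encoding.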

This affects all other modules that work with XML: xml.sax, xml.dom.minidom,
xml.dom.pulldom, xml.etree.ElementTree.
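
A minimal sketch of how this surfaces through two of the higher-level modules, assuming a hypothetical document that declares ISO-8859-1 (the DOC string is made up for illustration). It parses the same document as a str and as bytes encoded in the declared encoding; passing bytes sidesteps the str handling entirely, so it is a reasonable point of comparison.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical document: declares ISO-8859-1 and contains U+00B5.
DOC = "<?xml version='1.0' encoding='iso-8859-1'?><tag>\xb5</tag>"

# str input: relies on the parser disregarding the declared encoding.
et_from_str = ET.fromstring(DOC).text
dom_from_str = minidom.parseString(DOC).documentElement.firstChild.data

# bytes input: encoded to match the declaration, so no str handling is involved.
et_from_bytes = ET.fromstring(DOC.encode("iso-8859-1")).text

for label, value in [("ElementTree (str)", et_from_str),
                     ("minidom (str)", dom_from_str),
                     ("ElementTree (bytes)", et_from_bytes)]:
    print(label, ascii(value))
```

If the three values differ, the str path is decoding the data with the wrong encoding.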

Here is a patch which fixes parsing string data with non-UTF-8 XML.

--
assignee: serhiy.storchaka
components: Extension Modules, Unicode, XML
files: pyexpat_parse_str.patch
keywords: patch
messages: 181014
nosy: ezio.melotti, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Expat parser parses strings only when XML encoding is UTF-8
type: behavior
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4
Added file: http://bugs.python.org/file28916/pyexpat_parse_str.patch
