Re: [Zope3-dev] zope.tal.xmlparser.XMLParser() dislikes unicode
--On 14 January 2007 10:48:06 +0100 Bernd Dorn <[EMAIL PROTECTED]> wrote:

> > I am not sure if this behavior is intentional. Is the XMLParser
> > supposed to deal with unicode strings, or will it only accept a
> > standard Python string? A workaround inside parseString() would be
> > to check for unicode and convert the string on the fly to a Python
> > string with utf-8 encoding. This is possibly a limitation of the
> > underlying Expat parser. Any recommendation on how to deal with
> > this issue?
>
> IMHO it should only accept strings, because the value should be an
> XML string and therefore always has to be encoded in 'utf-8' or in
> the encoding specified in the processing instruction.

I disagree with that. Since Zope 3 is supposed to use unicode internally (at least that's the legend), it should also support unicode at the parser level. Other languages like Java also store XML as unicode strings and support parsing it.

Andreas

___
Zope3-dev mailing list Zope3-dev@zope.org
Unsub: http://mail.zope.org/mailman/options/zope3-dev/archive%40mail-archive.com
Re: [Zope3-dev] zope.tal.xmlparser.XMLParser() dislikes unicode
On 13.01.2007, at 18:49, Andreas Jung wrote:

> Hi,
>
> the XMLParser.parseString() method raises an exception
>
>   File "/opt/python-2.4.4/lib/python2.4/unittest.py", line 260, in run
>     testMethod()
>   File "/Users/ajung_data/sandboxes/Zope/Zope/lib/python/zope/tal/tests/test_xmlparser.py", line 127, in test_xx
>     self._run_check(xml, ())
>   File "/Users/ajung_data/sandboxes/Zope/Zope/lib/python/zope/tal/tests/test_xmlparser.py", line 106, in _run_check
>     parser.parseString(source)
>   File "/Users/ajung_data/sandboxes/Zope/Zope/lib/python/zope/tal/xmlparser.py", line 77, in parseString
>     self.parser.Parse(s, 1)
>   UnicodeEncodeError: 'ascii' codec can't encode characters in position 43-48: ordinal not in range(128)
>
> if the string to be parsed is a unicode string and contains some
> non-ascii chars. The following snippet from a private unittest
> (test_xmlparsers.py) shows the error.
>
>     def test_xx(self):
>         xml = unicode('>üöä', 'iso-8859-15')
>         self._run_check(xml, ())
>
> I am not sure if this behavior is intentional. Is the XMLParser
> supposed to deal with unicode strings, or will it only accept a
> standard Python string? A workaround inside parseString() would be
> to check for unicode and convert the string on the fly to a Python
> string with utf-8 encoding. This is possibly a limitation of the
> underlying Expat parser. Any recommendation on how to deal with
> this issue?

IMHO it should only accept strings, because the value should be an XML string and therefore always has to be encoded in 'utf-8' or in the encoding specified in the processing instruction.

Bernd

> Andreas
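The traceback quoted above ends in a UnicodeEncodeError from the 'ascii' codec, which suggests the unicode object was encoded to ASCII somewhere on the way into Expat. The following is only a sketch of that failure mode in isolation, written for Python 3 (an assumption; the thread itself uses Python 2.4). The helper name ascii_encode_error and the sample document are made up for illustration and are not part of zope.tal.

```python
def ascii_encode_error(text):
    """Return the UnicodeEncodeError raised by an ASCII encode, or None."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        return exc
    return None


# Hypothetical sample document; the original test string is truncated
# in the archive and is not reconstructed here.
err = ascii_encode_error("<doc>üöä</doc>")
if err is not None:
    # start/end delimit the offending characters (end is exclusive).
    print(err.encoding, err.start, err.end)
```

Pure-ASCII input passes through without error, which matches the report that the problem only appears when the string contains non-ascii characters.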
[Zope3-dev] zope.tal.xmlparser.XMLParser() dislikes unicode
Hi,

the XMLParser.parseString() method raises an exception

  File "/opt/python-2.4.4/lib/python2.4/unittest.py", line 260, in run
    testMethod()
  File "/Users/ajung_data/sandboxes/Zope/Zope/lib/python/zope/tal/tests/test_xmlparser.py", line 127, in test_xx
    self._run_check(xml, ())
  File "/Users/ajung_data/sandboxes/Zope/Zope/lib/python/zope/tal/tests/test_xmlparser.py", line 106, in _run_check
    parser.parseString(source)
  File "/Users/ajung_data/sandboxes/Zope/Zope/lib/python/zope/tal/xmlparser.py", line 77, in parseString
    self.parser.Parse(s, 1)
  UnicodeEncodeError: 'ascii' codec can't encode characters in position 43-48: ordinal not in range(128)

if the string to be parsed is a unicode string and contains some non-ascii chars. The following snippet from a private unittest (test_xmlparsers.py) shows the error.

    def test_xx(self):
        xml = unicode('encoding="utf-8"?>üöä', 'iso-8859-15')
        self._run_check(xml, ())

I am not sure if this behavior is intentional. Is the XMLParser supposed to deal with unicode strings, or will it only accept a standard Python string? A workaround inside parseString() would be to check for unicode and convert the string on the fly to a Python string with utf-8 encoding. This is possibly a limitation of the underlying Expat parser. Any recommendation on how to deal with this issue?

Andreas
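The workaround proposed in the original message (check for unicode, encode to utf-8, then hand the bytes to Expat) could look roughly like the sketch below. It is not the zope.tal implementation: it uses the standard library's xml.parsers.expat directly, is written for Python 3 (where str plays the role of Python 2's unicode), and the function name parse_text is hypothetical. Because the text is re-encoded as UTF-8, the sketch also asks Expat to treat the input as UTF-8 regardless of what the document's own declaration says; that choice is an assumption about the desired behaviour.

```python
import xml.parsers.expat


def parse_text(source):
    """Parse str or bytes with Expat and return the character data found."""
    chunks = []
    if isinstance(source, str):
        # Text has no byte encoding of its own: encode it as UTF-8 and
        # tell Expat to use UTF-8, overriding any declared encoding.
        data = source.encode("utf-8")
        parser = xml.parsers.expat.ParserCreate("utf-8")
    else:
        # Bytes are passed through; Expat honours the declared encoding.
        data = source
        parser = xml.parsers.expat.ParserCreate()
    # Character data may arrive in several pieces, so collect and join.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)


# Hypothetical sample document, used only for illustration.
print(parse_text('<?xml version="1.0" encoding="utf-8"?><doc>üöä</doc>'))
```

Byte strings still work as before, so callers that already pass encoded XML with a matching declaration are unaffected by the extra check.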