[issue11159] Sax parser crashes if given unicode file name

2013-02-02 Thread Roundup Robot

Roundup Robot added the comment:

New changeset d3e7aea8a550 by Serhiy Storchaka in branch '2.7':
Issue #11159: SAX parser now supports unicode file names.
http://hg.python.org/cpython/rev/d3e7aea8a550

New changeset d2622ca8493a by Serhiy Storchaka in branch '3.2':
Issue #11159: Add tests for testing SAX parser support of non-ascii file names.
http://hg.python.org/cpython/rev/d2622ca8493a

New changeset b85ba45b9579 by Serhiy Storchaka in branch '3.3':
Issue #11159: Add tests for testing SAX parser support of non-ascii file names.
http://hg.python.org/cpython/rev/b85ba45b9579

New changeset 107a06f1a542 by Serhiy Storchaka in branch 'default':
Issue #11159: Add tests for testing SAX parser support of non-ascii file names.
http://hg.python.org/cpython/rev/107a06f1a542

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11159
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11159] Sax parser crashes if given unicode file name

2013-02-02 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Fixed. Thank you for the report.

--
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed




[issue11159] Sax parser crashes if given unicode file name

2013-02-02 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 706218e0facb by Serhiy Storchaka in branch '2.7':
Fix tests for issue #11159.
http://hg.python.org/cpython/rev/706218e0facb

New changeset a7c074d9cbfb by Serhiy Storchaka in branch '3.2':
Fix tests for issue #11159.
http://hg.python.org/cpython/rev/a7c074d9cbfb

New changeset 2bf01f03ff40 by Serhiy Storchaka in branch '3.3':
Fix tests for issue #11159.
http://hg.python.org/cpython/rev/2bf01f03ff40

New changeset 4ab386b00aaf by Serhiy Storchaka in branch 'default':
Fix tests for issue #11159.
http://hg.python.org/cpython/rev/4ab386b00aaf

--




[issue11159] Sax parser crashes if given unicode file name

2013-01-14 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Yes, I had doubts about this too. I proceeded from the following 
considerations.

1. The system id is often used for file operations, and in that case you need 
to use the file system encoding. Unfortunately, Python 2 does not have the 
'surrogateescape' error handler, which would allow encoding an arbitrary name 
and then restoring and re-encoding it for file operations.

2. Python 2, unlike Python 3, accepts bytes, and they may not be valid 
UTF-8.

We have to choose between compatibility with Python 2 and compatibility with 
Python 3. I chose the first, because it is more important for a bugfix.

Maybe I am wrong.
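
For illustration only, here is a minimal sketch of the round trip that the 
'surrogateescape' handler makes possible. It assumes Python 3, where the 
handler exists; the file name is a made-up example, and UTF-8 is passed 
explicitly so that the result does not depend on the local file system 
encoding.

```python
# Bytes that are valid Latin-1 but NOT valid UTF-8 (hypothetical file name).
raw = b"caf\xe9.xml"

# A strict decode fails, which is the situation a Python 2 byte string can hide.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With surrogateescape the undecodable byte is smuggled through as a lone
# surrogate code point and can be restored exactly when re-encoding.
name = raw.decode("utf-8", "surrogateescape")
restored = name.encode("utf-8", "surrogateescape")
```

The restored bytes are identical to the input, so the name can be carried 
around as text and still be used for file operations later.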

--




[issue11159] Sax parser crashes if given unicode file name

2013-01-14 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is an alternative patch. It doesn't encode the system id when it is set; 
instead, the system id attribute can be bytes or unicode, and 
encoding/decoding happens only when a file is opened.
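
The idea can be sketched roughly as follows. This is not the code from the 
patch: the class LazySystemId and its method path_for_open are hypothetical 
names used only to show "store unchanged, convert at open time".

```python
import sys


class LazySystemId(object):
    """Hypothetical holder for a system id; not the class used in the patch."""

    def __init__(self, system_id):
        # Keep whatever the caller passed (bytes or text) without converting.
        self.system_id = system_id

    def path_for_open(self, encoding=None):
        """Return a bytes path, encoding text only at the moment of opening."""
        encoding = encoding or sys.getfilesystemencoding()
        if isinstance(self.system_id, bytes):
            # Already bytes: pass through untouched, even if not valid UTF-8.
            return self.system_id
        return self.system_id.encode(encoding)
```

With this shape the stored attribute is never altered, so callers that read 
the system id back get exactly what they set.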

--
Added file: http://bugs.python.org/file28722/sax_unicode_fn_alt-2.7.patch




[issue11159] Sax parser crashes if given unicode file name

2013-01-13 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


Removed file: http://bugs.python.org/file28268/sax_unicode_fn-2.7.patch




[issue11159] Sax parser crashes if given unicode file name

2013-01-13 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


Added file: http://bugs.python.org/file28268/sax_unicode_fn-2.7.patch




[issue11159] Sax parser crashes if given unicode file name

2013-01-13 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Ported the tests for non-ASCII system ids to 3.x.

If no one objects I'll commit this next week.

--
Added file: http://bugs.python.org/file28714/sax_unicode_fn-3.x.patch




[issue11159] Sax parser crashes if given unicode file name

2013-01-13 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
nosy: +ezio.melotti




[issue11159] Sax parser crashes if given unicode file name

2013-01-13 Thread Christian Heimes

Christian Heimes added the comment:

I don't think that the file system encoding is the correct answer here. AFAIR 
expat uses UTF-8 encoded strings. Python 3.x uses PyArg_ParseTupleAndKeywords() 
with the "s" format, which converts PyUnicode to PyBytes with the utf-8 codec.
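
For illustration, a minimal Python 3 sketch of why the choice of codec 
matters. Latin-1 is only an assumed stand-in for a non-UTF-8 file system 
encoding; it is not taken from the report.

```python
name = "\u00e5.timeline"  # the file name from the report, with a-ring escaped

as_utf8 = name.encode("utf-8")      # what a UTF-8 consumer such as expat expects
as_latin1 = name.encode("latin-1")  # what a Latin-1 file system would use

# The Latin-1 bytes are not valid UTF-8, so a UTF-8 consumer cannot decode them.
try:
    as_latin1.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
```

The two encodings agree only for ASCII names, which is why the problem shows 
up with non-ASCII file names.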

--
nosy: +christian.heimes




[issue11159] Sax parser crashes if given unicode file name

2013-01-11 Thread Sergey Prokhorov

Changes by Sergey Prokhorov sergey.prokho...@gmail.com:


--
nosy: +Sergey.Prokhorov




[issue11159] Sax parser crashes if given unicode file name

2012-12-29 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
assignee:  -> serhiy.storchaka




[issue11159] Sax parser crashes if given unicode file name

2012-12-09 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

However Python doesn't work with bytes filenames (I don't think this is a bug).

The proposed patch allows unicode filenames to be used in the SAX parser.

--
keywords: +patch
nosy: +serhiy.storchaka
stage:  -> patch review
Added file: http://bugs.python.org/file28268/sax_unicode_fn-2.7.patch




[issue11159] Sax parser crashes if given unicode file name

2012-12-08 Thread Carsten Grohmann

Changes by Carsten Grohmann carstengrohm...@gmx.de:


--
nosy: +cgrohmann




[issue11159] Sax parser crashes if given unicode file name

2012-12-08 Thread Daniel Urban

Changes by Daniel Urban urban.dani...@gmail.com:


--
type: crash -> behavior




[issue11159] Sax parser crashes if given unicode file name

2011-08-21 Thread John Chandler

John Chandler therealmetal...@gmail.com added the comment:

Confirmed that this is not an issue in Python 3. I just checked with Python 
3.3.0a0 and the example works fine: no exception is raised.

--
nosy: +John.Chandler




[issue11159] Sax parser crashes if given unicode file name

2011-02-09 Thread Rickard Lindberg

New submission from Rickard Lindberg ricl...@gmail.com:

The error is the following:

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/__init__.py", line 31, in parse
    parser.parse(filename_or_stream)
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/expatreader.py", line 109, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/xmlreader.py", line 119, in parse
    self.prepareParser(source)
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/expatreader.py", line 121, in prepareParser
    self._parser.SetBase(source.getSystemId())
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 0: ordinal not in range(128)

The following bash script can be used to reproduce the error:

#!/bin/sh

cat > å.timeline <<EOF
<?xml version="1.0" encoding="utf-8"?>
<timeline>
  <version>0.13.0devb38ace0a572b+</version>
  <categories>
  </categories>
  <events>
    <event>
      <start>2011-02-01 00:00:00</start>
      <end>2011-02-03 08:46:00</end>
      <text>asdsd</text>
    </event>
  </events>
  <view>
    <displayed_period>
      <start>2011-01-24 16:38:11</start>
      <end>2011-02-23 16:38:11</end>
    </displayed_period>
    <hidden_categories>
    </hidden_categories>
  </view>
</timeline>
EOF

python <<EOF
# encoding: utf-8
from xml.sax import parse
from xml.sax.handler import ContentHandler
parse(open(u"å.timeline", 'r'), ContentHandler())
EOF

If I instead do this, it works fine:

parse(u"å.timeline".encode("utf-8"), ContentHandler())

Also:

>>> sys.getfilesystemencoding()
'UTF-8'

I heard from another user that this was not a problem with Python 3.1.2.
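
For comparison, a minimal Python 3 sketch of the same scenario. It is not 
part of the original report: the helper parse_non_ascii_name and the 
ElementCollector handler are hypothetical names, the XML is shortened, and 
the file is written to a temporary directory. It assumes a file system that 
accepts non-ASCII names.

```python
import os
import tempfile
from xml.sax import parse
from xml.sax.handler import ContentHandler


class ElementCollector(ContentHandler):
    """Records the name of every element the parser reports."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_non_ascii_name():
    """Write a small XML file with a non-ASCII name and parse it by path."""
    xml = (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        "<timeline><version>0.13.0</version></timeline>\n"
    )
    with tempfile.TemporaryDirectory() as tmp:
        # "\u00e5" is the a-ring character used in the report's file name.
        path = os.path.join(tmp, "\u00e5.timeline")
        with open(path, "w", encoding="utf-8") as f:
            f.write(xml)
        handler = ElementCollector()
        parse(path, handler)  # a text file name is passed straight through
        return handler.names
```

Calling parse_non_ascii_name() returns the element names in document order 
and raises no UnicodeEncodeError.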

--
components: XML
messages: 128212
nosy: ricli85
priority: normal
severity: normal
status: open
title: Sax parser crashes if given unicode file name
type: crash
versions: Python 2.7
