[issue11159] Sax parser crashes if given unicode file name

2013-02-02 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 706218e0facb by Serhiy Storchaka in branch '2.7':
Fix tests for issue #11159.
http://hg.python.org/cpython/rev/706218e0facb

New changeset a7c074d9cbfb by Serhiy Storchaka in branch '3.2':
Fix tests for issue #11159.
http://hg.python.org/cpython/rev/a7c074d9cbfb

New changeset 2bf01f03ff40 by Serhiy Storchaka in branch '3.3':
Fix tests for issue #11159.
http://hg.python.org/cpython/rev/2bf01f03ff40

New changeset 4ab386b00aaf by Serhiy Storchaka in branch 'default':
Fix tests for issue #11159.
http://hg.python.org/cpython/rev/4ab386b00aaf

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11159] Sax parser crashes if given unicode file name

2013-02-02 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Fixed. Thank you for the report.

--
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed




[issue11159] Sax parser crashes if given unicode file name

2013-02-02 Thread Roundup Robot

Roundup Robot added the comment:

New changeset d3e7aea8a550 by Serhiy Storchaka in branch '2.7':
Issue #11159: SAX parser now supports unicode file names.
http://hg.python.org/cpython/rev/d3e7aea8a550

New changeset d2622ca8493a by Serhiy Storchaka in branch '3.2':
Issue #11159: Add tests for testing SAX parser support of non-ascii file names.
http://hg.python.org/cpython/rev/d2622ca8493a

New changeset b85ba45b9579 by Serhiy Storchaka in branch '3.3':
Issue #11159: Add tests for testing SAX parser support of non-ascii file names.
http://hg.python.org/cpython/rev/b85ba45b9579

New changeset 107a06f1a542 by Serhiy Storchaka in branch 'default':
Issue #11159: Add tests for testing SAX parser support of non-ascii file names.
http://hg.python.org/cpython/rev/107a06f1a542

--
nosy: +python-dev




[issue11159] Sax parser crashes if given unicode file name

2013-01-14 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is an alternative patch. It doesn't encode the system id when it is set; 
instead, the system id attribute can be either bytes or unicode, and 
encoding/decoding happens only when a file is opened.
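
The idea of deferring conversion until the file is opened can be sketched as follows. This is only an illustration written for Python 3, not the attached 2.7 patch: the helper name `open_system_id` is hypothetical, and `os.fsdecode` stands in for whatever conversion the patch actually performs.

```python
import os
import tempfile


def open_system_id(system_id):
    """Hypothetical helper: open a system id that may be bytes or str.

    The id is stored unchanged by the caller; conversion to a text path
    happens only here, at the moment the file is opened.
    """
    # os.fsdecode() returns str unchanged and decodes bytes with the
    # file system encoding.
    return open(os.fsdecode(system_id), "rb")


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "\u00e5.xml")
    with open(path, "wb") as f:
        f.write(b"<root/>")

    # The same file is reachable through a text id and a bytes id.
    with open_system_id(path) as f:
        from_text = f.read()
    with open_system_id(os.fsencode(path)) as f:
        from_bytes = f.read()
```

Both reads return the same content, because the stored id is converted only at the point of use.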

--
Added file: http://bugs.python.org/file28722/sax_unicode_fn_alt-2.7.patch




[issue11159] Sax parser crashes if given unicode file name

2013-01-14 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Yes, I had doubts about this too. I proceeded from the following 
considerations.

1. The system id is often used for file operations, and in that case you need 
to use the file system encoding. Unfortunately, Python 2 does not have the 
'surrogateescape' error handler, which would allow an arbitrary name to be 
encoded and then restored and re-encoded for file operations.

2. Python 2, in contrast to Python 3, accepts bytes, and they may not be valid 
UTF-8.

We have to choose between compatibility with Python 2 and Python 3. I chose the 
first, because it is more important for a bugfix.

Maybe I am wrong.
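
For reference, the round trip that 'surrogateescape' makes possible in Python 3 looks roughly like this. It is a minimal sketch: the byte string `raw_name` is an invented example of a file name that is not valid UTF-8.

```python
# Python 3 only: the 'surrogateescape' error handler does not exist in Python 2.
raw_name = b"caf\xe9.xml"  # Latin-1 bytes; 0xE9 is not valid UTF-8 here

# The undecodable byte is mapped to a lone surrogate instead of raising.
text_name = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text_name.encode("utf-8", "surrogateescape")
```

Here `round_tripped` equals `raw_name`, so an arbitrary byte name survives being held as text.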

--




[issue11159] Sax parser crashes if given unicode file name

2013-01-13 Thread Christian Heimes

Christian Heimes added the comment:

I don't think that the file system encoding is the correct answer here. AFAIR 
expat uses UTF-8 encoded strings. Python 3.x uses PyArg_ParseTupleAndKeywords() 
with "s" which converts PyUnicode to PyBytes with the utf-8 codec.
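
The Python 3 behaviour described here can be observed directly on the expat binding. This is a small sketch, assuming a Python 3 interpreter; the file name is just an example.

```python
import xml.parsers.expat

# In Python 3, SetBase() accepts a str and hands it to expat as UTF-8,
# so a non-ASCII base does not raise UnicodeEncodeError.
parser = xml.parsers.expat.ParserCreate()
parser.SetBase("\u00e5.timeline")

# GetBase() returns the base that was set, as a str.
base = parser.GetBase()
```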

--
nosy: +christian.heimes




[issue11159] Sax parser crashes if given unicode file name

2013-01-13 Thread Ezio Melotti

Changes by Ezio Melotti :


--
nosy: +ezio.melotti




[issue11159] Sax parser crashes if given unicode file name

2013-01-13 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Ported the tests for non-ASCII system ids to 3.x.

If no one objects, I'll commit this next week.

--
Added file: http://bugs.python.org/file28714/sax_unicode_fn-3.x.patch




[issue11159] Sax parser crashes if given unicode file name

2013-01-13 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


Added file: http://bugs.python.org/file28268/sax_unicode_fn-2.7.patch




[issue11159] Sax parser crashes if given unicode file name

2013-01-13 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


Removed file: http://bugs.python.org/file28268/sax_unicode_fn-2.7.patch




[issue11159] Sax parser crashes if given unicode file name

2013-01-11 Thread Sergey Prokhorov

Changes by Sergey Prokhorov :


--
nosy: +Sergey.Prokhorov




[issue11159] Sax parser crashes if given unicode file name

2012-12-29 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
assignee:  -> serhiy.storchaka




[issue11159] Sax parser crashes if given unicode file name

2012-12-09 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

However Python doesn't work with bytes filenames (I don't think this is a bug).

The proposed patch allows unicode filenames to be used in the SAX parser.

--
keywords: +patch
nosy: +serhiy.storchaka
stage:  -> patch review
Added file: http://bugs.python.org/file28268/sax_unicode_fn-2.7.patch




[issue11159] Sax parser crashes if given unicode file name

2012-12-08 Thread Daniel Urban

Changes by Daniel Urban :


--
type: crash -> behavior




[issue11159] Sax parser crashes if given unicode file name

2012-12-08 Thread Carsten Grohmann

Changes by Carsten Grohmann :


--
nosy: +cgrohmann




[issue11159] Sax parser crashes if given unicode file name

2011-08-21 Thread John Chandler

John Chandler  added the comment:

Confirmed that this is not an issue in Python 3. I just checked with Python 
3.3.0a0 and the example works fine: no exception is raised.

--
nosy: +John.Chandler




[issue11159] Sax parser crashes if given unicode file name

2011-02-09 Thread Rickard Lindberg

New submission from Rickard Lindberg :

The error is the following:

Traceback (most recent call last):
  File "", line 4, in 
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/__init__.py", line 31, in parse
    parser.parse(filename_or_stream)
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/expatreader.py", line 109, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/xmlreader.py", line 119, in parse
    self.prepareParser(source)
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/expatreader.py", line 121, in prepareParser
    self._parser.SetBase(source.getSystemId())
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 0: ordinal not in range(128)

The following bash script can be used to reproduce the error:

#!/bin/sh

cat > å.timeline <

  0.13.0devb38ace0a572b+
  
  
  

  2011-02-01 00:00:00
  2011-02-03 08:46:00
  asdsd

  
  

  2011-01-24 16:38:11
  2011-02-23 16:38:11



  

EOF

python <>> sys.getfilesystemencoding()
'UTF-8'

I heard from another user that this was not a problem with Python 3.1.2.
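
A self-contained way to exercise the same code path is sketched below. It is written for Python 3, where the call is expected to succeed; the XML content and the `TagCollector` handler are invented for the example and are not the original timeline file.

```python
import os
import tempfile
import xml.sax


class TagCollector(xml.sax.ContentHandler):
    """Record the name of every element the parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


def parse_non_ascii_filename():
    """Write a tiny XML file with a non-ASCII name and parse it by path."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "\u00e5.timeline")
        with open(path, "w", encoding="utf-8") as f:
            f.write("<timeline><version>demo</version></timeline>")
        handler = TagCollector()
        # Passing the path as text makes the SAX layer use it as the system id.
        xml.sax.parse(path, handler)
        return handler.tags


tags = parse_non_ascii_filename()
```

On an affected Python 2.7, the equivalent call with a unicode path is what raised the UnicodeEncodeError shown in the traceback.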

--
components: XML
messages: 128212
nosy: ricli85
priority: normal
severity: normal
status: open
title: Sax parser crashes if given unicode file name
type: crash
versions: Python 2.7
