[issue13639] UnicodeDecodeError when creating tar.gz with unicode name

2011-12-26 Thread Roundup Robot
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset dc1045d08bd8 by Jason R. Coombs in branch '2.7': Issue #11638: Adding test to ensure .tar.gz files can be generated by sdist command with unicode metadata, based on David Barnett's patch.

2011-12-25 Thread Lars Gustäbel
Lars Gustäbel l...@gustaebel.de added the comment: I think we should wrap this up as soon as possible, because it has already absorbed too much of our time. The issue we discuss here is a tiny glitch triggered by a corner-case. My original idea was to fix it in a minimal sort of way that is

2011-12-25 Thread Jason R. Coombs
Jason R. Coombs jar...@jaraco.com added the comment: I also feel (1) or (3) is best for this issue. If there is a _better_ implementation, it should be reserved for a separate improvement to Python 3.2+. I lean slightly toward (3) because it would support filenames with Unicode characters

2011-12-25 Thread Terry J. Reedy
Terry J. Reedy tjre...@udel.edu added the comment: As I understand the patched code, it only fixes the issue for unicode names that can be latin-1 encoded and that other unicode names will raise the same exception with 'latin-1' (or equivalent) substituted for 'ascii'. So it is easy for me to
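
Whether a name outside Latin-1 raises depends on the error handler passed to `encode`. The following is a minimal Python 3 sketch, not the patched tarfile code; `encode_fname` is a hypothetical helper introduced only for illustration.

```python
def encode_fname(name, errors="replace"):
    """Encode a text filename to Latin-1 bytes (hypothetical helper).

    errors="replace" substitutes b"?" for characters that Latin-1
    cannot represent; errors="strict" raises UnicodeEncodeError.
    """
    return name.encode("iso-8859-1", errors)


if __name__ == "__main__":
    # Representable in Latin-1: encoded byte for byte.
    print(encode_fname("h\u00e9llo.tar"))
    # Not representable: each such character becomes "?".
    print(encode_fname("\u65e5\u672c.tar"))
    # The strict handler raises instead of substituting.
    try:
        encode_fname("\u65e5\u672c.tar", errors="strict")
    except UnicodeEncodeError as exc:
        print("strict handler raised:", exc.reason)
```

With `replace`, a name that Latin-1 cannot represent is degraded rather than rejected, so the stored name no longer matches the original.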

2011-12-25 Thread Terry J. Reedy
Terry J. Reedy tjre...@udel.edu added the comment: I just took a look at the 3.2 tarfile code and see that it always (because self.name is always unicode) does the same encoding, with 'replace', referencing RFC1952. Although there are a few other differences, they appear inconsequential, so

2011-12-24 Thread Lars Gustäbel
Lars Gustäbel l...@gustaebel.de added the comment: I thought about that myself, too. It is clearly no new feature, it is really more some kind of a fix. Unicode pathnames given to tarfile.open() are just passed through to the open() function, which is why this always has been working, except

2011-12-24 Thread Terry J. Reedy
Terry J. Reedy tjre...@udel.edu added the comment: With that explanation, that it is one case out of six that fails, for whatever reason, I agree. That leaves the issue of whether the fix is the right one. I currently agree with Victor that we should do what the rest of Python does and what

2011-12-23 Thread Terry J. Reedy
Terry J. Reedy tjre...@udel.edu added the comment: 2.7 is closed to new features. This looks like it might be one. The 2.7 doc for tarfile.open says "Return a TarFile object for the pathname name." Does the meaning of 'pathname' in 2.7 generally include unicode as well as str objects? (It is

2011-12-21 Thread Lars Gustäbel
Lars Gustäbel l...@gustaebel.de added the comment: tarfile under Python 2.x is not particularly designed to support unicode filenames (the gzip module does not support them either), but that should not be too hard to fix. -- keywords: +patch Added file:

2011-12-21 Thread Jason R. Coombs
Jason R. Coombs jar...@jaraco.com added the comment: That looks like a good patch to me. Do you want to commit it, or would you rather I do?

2011-12-21 Thread Roundup Robot
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset a60a3610a97b by Lars Gustäbel in branch '2.7': Issue #13639: Accept unicode filenames in tarfile.open(mode=w|gz). http://hg.python.org/cpython/rev/a60a3610a97b -- nosy: +python-dev

2011-12-21 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: + self.name = self.name.encode("iso-8859-1", "replace") Why did you choose ISO-8859-1? I think that the filesystem encoding should be used instead: -self.name = self.name.encode("iso-8859-1", "replace") +self.name =
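
To make the two options concrete, the Python 3 sketch below encodes the same name with ISO-8859-1 and with whatever `sys.getfilesystemencoding()` reports. It only illustrates the trade-off and is not the code from either proposed patch; `encode_name` is a hypothetical helper.

```python
import sys


def encode_name(name, encoding):
    """Encode a text filename with the given codec (hypothetical helper).

    The "replace" handler keeps encoding from raising when the codec
    cannot represent a character.
    """
    return name.encode(encoding, "replace")


if __name__ == "__main__":
    name = "caf\u00e9.tar"
    # ISO-8859-1 gives the same bytes on every platform.
    print("iso-8859-1:", encode_name(name, "iso-8859-1"))
    # The filesystem encoding depends on the platform and locale,
    # so these bytes can differ from one machine to another.
    fs_encoding = sys.getfilesystemencoding()
    print(fs_encoding + ":", encode_name(name, fs_encoding))
```

The ISO-8859-1 result is fixed, while the filesystem-encoded result varies with the environment in which the archive is created.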

2011-12-21 Thread Lars Gustäbel
Lars Gustäbel l...@gustaebel.de added the comment: See http://bugs.python.org/issue11638#msg150029

2011-12-21 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: The gzip format (defined in RFC 1952) allows storing the original filename (without the .gz suffix) in an additional field in the header (the FNAME field). Latin-1 (iso-8859-1) is required. Hum, it looks like the author of the
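
The FNAME field described above can be sketched by building a gzip member by hand. This is a minimal Python 3 illustration of the RFC 1952 header layout, not the tarfile or gzip module implementation; `gzip_with_fname` and `read_fname` are hypothetical helpers, and the `replace` error handler is an assumption borrowed from the patch under discussion.

```python
import gzip
import struct
import zlib

FNAME = 0x08  # FLG bit meaning "original file name present" (RFC 1952)


def gzip_with_fname(payload, name):
    """Build a gzip member whose header carries `name` in the FNAME field."""
    # FNAME is Latin-1 text terminated by a zero byte.
    fname = name.encode("iso-8859-1", "replace") + b"\x00"
    # ID1 ID2, CM=8 (deflate), FLG, MTIME=0, XFL=0, OS=255 (unknown)
    header = b"\x1f\x8b\x08" + bytes([FNAME]) + struct.pack("<L", 0) + b"\x00\xff"
    # Raw deflate stream: negative wbits suppresses the zlib wrapper.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    body = compressor.compress(payload) + compressor.flush()
    # Trailer: CRC32 and size of the uncompressed data, little-endian.
    trailer = struct.pack("<LL", zlib.crc32(payload) & 0xFFFFFFFF,
                          len(payload) & 0xFFFFFFFF)
    return header + fname + body + trailer


def read_fname(blob):
    """Return the raw FNAME bytes of a gzip member that has only FNAME set."""
    if blob[:2] != b"\x1f\x8b" or not blob[3] & FNAME:
        raise ValueError("not a gzip member with an FNAME field")
    end = blob.index(b"\x00", 10)  # the fixed header is 10 bytes long
    return blob[10:end]


if __name__ == "__main__":
    blob = gzip_with_fname(b"data", "h\u00e9llo.tar")
    print(read_fname(blob))
    print(gzip.decompress(blob))
```

The standard `gzip` module decompresses the hand-built member, which is a reasonable check that the header layout is well-formed.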

2011-12-20 Thread Antoine Pitrou
Changes by Antoine Pitrou pit...@free.fr: -- nosy: +lars.gustaebel

2011-12-19 Thread Jason R. Coombs
New submission from Jason R. Coombs jar...@jaraco.com: python -c "import tarfile; tarfile.open(u'hello.tar.gz', 'w|gz')" produces Traceback (most recent call last): File "<string>", line 1, in <module> File "C:\Users\jaraco\projects\public\cpython\Lib\tarfile.py", line 1687, in open
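
For comparison, the same call can be exercised under Python 3, where filenames are text by default. The sketch below is not the Python 2.7 reproduction from the report; `make_stream_targz` is a hypothetical helper, and it writes into a temporary directory so that nothing is left behind.

```python
import os
import tarfile
import tempfile


def make_stream_targz(directory, name="hello.tar.gz"):
    """Create an empty archive with the stream mode 'w|gz' and return its path."""
    path = os.path.join(directory, name)
    # 'w|gz' selects tarfile's stream interface with gzip compression,
    # the same mode string used in the report.
    archive = tarfile.open(path, "w|gz")
    archive.close()
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = make_stream_targz(tmp)
        with open(path, "rb") as handle:
            # A gzip stream starts with the magic bytes 1f 8b.
            print(handle.read(2) == b"\x1f\x8b")
```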