New submission from Vinay Sajip:
The attached file failing.tar.gz contains a member whose path includes
UTF-8-encoded non-ASCII characters. This causes extractall() to fail, but
only when the destination path is Unicode. That's because joining the Unicode
destination with the byte-string member name leads to an implicit
str->unicode conversion using the ASCII codec.
Test script:
import shutil, tarfile, tempfile
tf = tarfile.open('failing.tar.gz', 'r:gz')
workdir = tempfile.mkdtemp()
try:
    # N.B. ensure dest path is Unicode to trigger the failure
    tf.extractall(unicode(workdir))
finally:
    shutil.rmtree(workdir)
Result:
$ python untar.py
Traceback (most recent call last):
  File "untar.py", line 8, in <module>
    tf.extractall(unicode(workdir))
  File "/usr/lib/python2.7/tarfile.py", line 2046, in extractall
    self.extract(tarinfo, path)
  File "/usr/lib/python2.7/tarfile.py", line 2083, in extract
    self._extract_member(tarinfo, os.path.join(path, tarinfo.name))
  File "/usr/lib/python2.7/posixpath.py", line 71, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 44: ordinal not in range(128)
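
The failing concatenation in posixpath.join can be illustrated without the
archive. What follows is a minimal sketch, not the tarfile code itself: the
member name and destination path are made up (the real contents of
failing.tar.gz are not shown here), and implicit_join() and explicit_join()
are hypothetical helpers. The first spells out the ASCII decode that Python 2
performs silently when a unicode object is concatenated with a byte string;
the second decodes with a known encoding first.

```python
# Hypothetical values; the real member name in failing.tar.gz is not known here.
member_name = u'caf\xe9.txt'.encode('utf-8')  # byte string containing 0xc3 0xa9
dest = u'/tmp/workdir'                        # Unicode destination path


def implicit_join(dest, name_bytes):
    """Spell out the coercion Python 2 applies to `unicode + str`:
    the byte string is decoded with the ASCII codec."""
    return dest + u'/' + name_bytes.decode('ascii')


def explicit_join(dest, name_bytes, encoding='utf-8'):
    """Decode the member name with a stated encoding before joining.
    UTF-8 is an assumption that happens to match this archive."""
    return dest + u'/' + name_bytes.decode(encoding)


try:
    implicit_join(dest, member_name)
    failed = False
except UnicodeDecodeError as exc:
    # Same error class and offending byte (0xc3) as in the traceback above.
    failed = True
    error_text = str(exc)

joined = explicit_join(dest, member_name)
```

The ASCII decode raises on the first non-ASCII byte, while decoding with the
archive's actual encoding yields a usable Unicode path. Whether tarfile
should decode member names itself or leave that to the caller is not settled
by this sketch.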
----------
components: Library (Lib), Unicode
messages: 181631
nosy: ezio.melotti, vinay.sajip
priority: normal
severity: normal
status: open
title: tarfile extract fails when Unicode in pathname
type: behavior
versions: Python 2.7
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue17153>
_______________________________________