New submission from Philippe:

The extraction fails when calling tarfile.open using this archive:
http://archive.apache.org/dist/commons/logging/source/commons-logging-1.1.2-src.tar.gz

After some investigation, the file can be extracted with gnu tar and bsdtar and the gzip compression is not the issue: if I gunzip the tar.gz to a tar and call tarfile on plain tar, the problem is the same.

Also this archive was created most likely on Windows (based on the `file` command output) using some Java tools per http://commons.apache.org/proper/commons-logging/building.html from these original files: http://svn.apache.org/repos/asf/commons/proper/logging/tags/LOGGING_1_1_2/ ... that's all I could find out.

The error trace is slightly different on 2.7 and 3.4 but similar.
The problem has been verified on Linux 64 with Python 2.7 and 3.4 and on Windows with Python 2.7.

On 2.7:

>>> TarFile.taropen(name)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/tarfile.py", line 1705, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/usr/lib/python2.7/tarfile.py", line 1574, in __init__
    self.firstmember = self.next()
  File "/usr/lib/python2.7/tarfile.py", line 2335, in next
    raise ReadError(str(e))
tarfile.ReadError: invalid header

On 3.4:

>>> TarFile.taropen(name)
Traceback (most recent call last):
  File "/usr/lib/python3.4/tarfile.py", line 180, in nti
    n = int(nts(s, "ascii", "strict") or "0", 8)
ValueError: invalid literal for int() with base 8: ' '

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.4/tarfile.py", line 2248, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/usr/lib/python3.4/tarfile.py", line 1083, in fromtarfile
    obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
  File "/usr/lib/python3.4/tarfile.py", line 1032, in frombuf
    obj.uid = nti(buf[108:116])
  File "/usr/lib/python3.4/tarfile.py", line 182, in nti
    raise InvalidHeaderError("invalid header")
tarfile.InvalidHeaderError: invalid header

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.4/tarfile.py", line 1595, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/usr/lib/python3.4/tarfile.py", line 1469, in __init__
    self.firstmember = self.next()
  File "/usr/lib/python3.4/tarfile.py", line 2260, in next
    raise ReadError(str(e))
tarfile.ReadError: invalid header

----------
components: Library (Lib)
files: commons-logging-1.1.2-src.tar.gz
messages: 245839
nosy: lars.gustaebel, pombreda
priority: normal
severity: normal
status: open
title: tarfile fails to extract archive (handled fine by gnu tar and bsdtar)
versions: Python 2.7, Python 3.4
Added file: http://bugs.python.org/file39814/commons-logging-1.1.2-src.tar.gz

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24514>
_______________________________________
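
For anyone who wants to look at the failing field without the attached archive, here is a minimal sketch of what the 3.4 traceback points at: frombuf() parses the uid from header bytes 108:116 as octal, and int() rejects a field that holds only whitespace. The helper names (strict_octal, lenient_octal, make_header) are made up for this illustration and are not part of tarfile, and the all-spaces field is an assumption based on the ValueError text, not a dump of the real archive.

```python
import tarfile

# Offset taken from the 3.4 traceback: obj.uid = nti(buf[108:116])
UID_FIELD = slice(108, 116)


def strict_octal(field: bytes) -> int:
    """Roughly what the traceback shows: cut at the first NUL, then int(..., 8)."""
    text = field.split(b"\0", 1)[0].decode("ascii")
    return int(text or "0", 8)


def lenient_octal(field: bytes) -> int:
    """Hypothetical tolerant variant: a whitespace-only field counts as 0."""
    text = field.split(b"\0", 1)[0].decode("ascii").strip()
    return int(text or "0", 8)


def make_header(uid_field: bytes) -> bytes:
    """Build one 512-byte ustar header and overwrite its uid field.

    The checksum is not recomputed; the block is only used for slicing here.
    """
    info = tarfile.TarInfo("example.txt")
    buf = bytearray(info.tobuf(format=tarfile.USTAR_FORMAT))
    buf[UID_FIELD] = uid_field
    return bytes(buf)


if __name__ == "__main__":
    header = make_header(b" " * 8)  # assumed content of the bad field
    print(repr(header[UID_FIELD]))
    try:
        strict_octal(header[UID_FIELD])
    except ValueError as exc:
        print("strict parse fails:", exc)
    print("lenient parse gives:", lenient_octal(header[UID_FIELD]))
```

To check the real file, the first 512 bytes of the gunzipped archive can be sliced with the same UID_FIELD and passed to both helpers.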