[issue24259] tar.extractall() does not recognize unexpected EOF

2015-07-06 Thread Lars Gustäbel
Changes by Lars Gustäbel l...@gustaebel.de: resolution: -> fixed; stage: patch review -> resolved; status: open -> closed

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-07-06 Thread Roundup Robot
Roundup Robot added the comment: New changeset 372aa98eb72e by Lars Gustäbel in branch '2.7': Issue #24259: tarfile now raises a ReadError if an archive is truncated inside a data segment. https://hg.python.org/cpython/rev/372aa98eb72e New changeset c7f4f61697b7 by Lars Gustäbel in branch
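
As a rough illustration of the behaviour the changeset describes, the sketch below builds a small archive in memory, cuts it off inside the data segment, and checks whether extractall() raises tarfile.ReadError. The helper names build_archive and extracts_cleanly are made up for this example, and the result assumes a Python version that includes the fix.

```python
import io
import tarfile
import tempfile


def build_archive(payload: bytes) -> bytes:
    """Return an uncompressed tar archive holding one file, 'data.bin'."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo("data.bin")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def extracts_cleanly(archive: bytes) -> bool:
    """Try extractall() into a scratch directory; False means ReadError."""
    # The 'filter' argument only exists on newer Python versions.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tempfile.TemporaryDirectory() as scratch:
        try:
            with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
                tar.extractall(scratch, **kwargs)
        except tarfile.ReadError:
            return False
    return True


good = build_archive(b"x" * 2048)
# Keep the 512-byte header block but only part of the 2048 data bytes.
cut = good[: 512 + 1000]
```

Here extracts_cleanly(good) should succeed, while extracts_cleanly(cut) should report the truncation.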

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-06-30 Thread Lars Gustäbel
Lars Gustäbel added the comment: Martin, I followed your suggestion to raise ReadError. This needed an additional change in copyfileobj() because it is used both for adding file data to an archive and extracting file data from an archive. But I think the patch is in good shape now.

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-06-21 Thread Martin Panter
Martin Panter added the comment: From the current documentation and limited experience with the module, ReadError (or a subclass) sounds best. I would expect OSError only for OS-level things, like file not found, disk error, etc. The patches look good. One last suggestion is to use

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-06-01 Thread Lars Gustäbel
Lars Gustäbel added the comment: @Martin: This is actually a nice idea that I hadn't thought of. I updated the Python 3 patch to use a seek() that moves to one byte before the next header block, reads the remaining byte and raises an error if it hits EOF. The code looks rather clean compared
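
The seek-and-read-one-byte idea from this comment could look roughly like the following. This is a standalone sketch, not the actual patch: TruncatedArchiveError, next_header_offset and check_data_segment are hypothetical names, and it assumes the usual 512-byte tar block size with data segments padded to a whole number of blocks.

```python
import io

BLOCKSIZE = 512  # tar archives are laid out in 512-byte blocks


class TruncatedArchiveError(Exception):
    """Hypothetical error type used only in this sketch."""


def next_header_offset(data_offset: int, size: int) -> int:
    """Offset of the header block that follows a data segment."""
    blocks = -(-size // BLOCKSIZE)  # ceiling division: data is block-padded
    return data_offset + blocks * BLOCKSIZE


def check_data_segment(fileobj, data_offset: int, size: int) -> None:
    """Raise if the file ends before the padded data segment does."""
    target = next_header_offset(data_offset, size)
    if target == data_offset:
        return  # empty member, nothing to verify
    # Jump to the last byte of the segment instead of reading all of it.
    fileobj.seek(target - 1)
    if not fileobj.read(1):
        raise TruncatedArchiveError("unexpected end of data")
```

The appeal of this approach is that it needs one seek and a one-byte read per member, so a broken archive can be detected without unpacking every data segment.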

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-06-01 Thread Lars Gustäbel
Changes by Lars Gustäbel l...@gustaebel.de: Added file: http://bugs.python.org/file39580/issue24259-2.x-2.diff

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-30 Thread Thomas Guettler
Thomas Guettler added the comment: With Python 3.4.0 you get an OSError if you try to extractall() the uploaded tar_which_is_cut.tar. That's nice. It seems that only 2.7 is buggy. === python3 Python 3.4.0 (default, Apr 11 2014, 13:05:11) [GCC 4.8.2] on linux Type "help", "copyright",

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-29 Thread Lars Gustäbel
Lars Gustäbel added the comment: @Thomas: I think your proposal adds a little too much complexity. Also, ExFileObject is not used during iteration, and we would like to detect broken archives without unpacking all the data segments first. I have written patches for Python 2 and 3.

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-29 Thread Thomas Guettler
Thomas Guettler added the comment: I thought about this again. It could be solved with the help of a ByteCountingStreamReader. By ByteCountingStreamReader I mean a wrapper around a stream, like codecs.StreamReader. But the ByteCountingStreamReader should not change the content, but just
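
The comment is cut off, so the following is only a guess at what such a wrapper might look like, going by its name: it passes data through unchanged and keeps a running total of the bytes read. The class name ByteCountingReader and its bytes_read attribute are invented for this sketch.

```python
import io


class ByteCountingReader:
    """Pass-through wrapper that counts how many bytes were read."""

    def __init__(self, raw):
        self._raw = raw
        self.bytes_read = 0

    def read(self, size=-1):
        data = self._raw.read(size)
        self.bytes_read += len(data)  # content is returned untouched
        return data


reader = ByteCountingReader(io.BytesIO(b"abcdef"))
first = reader.read(4)
rest = reader.read(10)  # asks for more than is left
```

A caller could then compare bytes_read with the size recorded in the member's header and treat a shortfall as a truncated archive.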

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-29 Thread Martin Panter
Martin Panter added the comment: For the record, the difference between Python 2 and 3 is probably a side effect of revision 050f0f7be11e. Python 2 copies data from the ExFileObject returned by extractfile(), while Python 3 copies directly from the underlying file. The patches to the file

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-28 Thread Lars Gustäbel
Lars Gustäbel added the comment: I have written a test for the issue, so that we have a basis for discussion. There are four different scenarios where an unexpected EOF can occur: inside a metadata block, directly after a metadata block, inside a data segment or directly after a data segment
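
To make the four scenarios concrete, the sketch below computes one cut-off position for each of them in a single-member archive and produces the corresponding truncated byte strings. It is not the test from the patch. The function name truncation_points and the dictionary keys are made up, and the offsets assume a plain ustar archive with one 512-byte header block followed by block-padded data.

```python
import io
import tarfile

BLOCKSIZE = 512


def truncation_points(size: int) -> dict:
    """Cut-off offsets for the four scenarios, for one member of `size` bytes."""
    padded = -(-size // BLOCKSIZE) * BLOCKSIZE
    return {
        "inside metadata block": BLOCKSIZE // 2,
        "directly after metadata block": BLOCKSIZE,
        "inside data segment": BLOCKSIZE + size // 2,
        "directly after data segment": BLOCKSIZE + padded,
    }


# Build a one-member archive in memory to cut up.
payload = b"x" * 1000
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    info = tarfile.TarInfo("data.bin")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
archive = buf.getvalue()

points = truncation_points(len(payload))
truncated = {name: archive[:cut] for name, cut in points.items()}
```

Each entry in truncated can then be fed to tarfile.open() and extractall() to see how a given Python version reacts to that kind of damage.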

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-28 Thread Martin Panter
Martin Panter added the comment: If you are already seeking in the file, can’t you seek to the end to determine the length of the file, and then use that to verify if a data segment is truncated? And if you can’t seek, I guess you have to read all the bytes anyway. I guess Ethan’s test was an
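
For a seekable file, the length comparison suggested here might be sketched as follows. The function name segment_is_complete is hypothetical, and the check deliberately ignores block padding, looking only at the bytes the header claims.

```python
import io


def segment_is_complete(fileobj, data_offset: int, size: int) -> bool:
    """True if the file is long enough to hold the whole data segment."""
    position = fileobj.tell()
    end = fileobj.seek(0, io.SEEK_END)  # seek() returns the new position
    fileobj.seek(position)              # restore the caller's position
    return data_offset + size <= end


complete = io.BytesIO(bytes(512 + 1024))
short = io.BytesIO(bytes(1000))
```

For a stream that cannot seek, this shortcut is not available and the data has to be read to find out whether it is all there.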

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-27 Thread Thomas Guettler
Thomas Guettler added the comment: Who has enough knowledge of the tarfile module to create a good patch? -- nosy: +guettli

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-27 Thread Martin Panter
Martin Panter added the comment: I might be able to make a patch, but what should the patch do exactly?
* Raise an exception as soon as something wrong is found
* Defer exceptions until close() is called, to allow partial data recovery
* Add some sort of defects API that you can optionally

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-27 Thread Ethan Furman
Ethan Furman added the comment: I ran the OP's code in 2.7, 3.3, 3.4, and 3.5, and also tried Ubuntu's GNU tar 1.27.1. GNU tar did not report any errors. Python (all tested versions) did not report any errors either (with the errorlevel parameter missing, set to 1, and set to 2).

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-27 Thread Ethan Furman
Ethan Furman added the comment: On the SO question [1] the OP stated that he tried errorlevel of both 1 and 2 with no effect... however, he was using Python 2.6. Martin, can you run that same test with 2.6 to verify that errorlevel did not work there, but does now? [1]

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-27 Thread Martin Panter
Martin Panter added the comment: Actually, looking closer at the module, perhaps you just need to set the errorlevel=1 option: >>> with tarfile.open("truncated.tar", errorlevel=1) as tar: ... tar.extractall("test-dir") ... Traceback (most recent call last): File "<stdin>", line 2, in <module>

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-27 Thread Martin Panter
Martin Panter added the comment: I guess it depends on the particular tar file and where it gets truncated. Just now I tested with a tar file created from Python’s Tools directory: $ tar c Tools/ > good.tar $ ls -gG good.tar -rw-r--r-- 1 17397760 May 28 02:43 good.tar $ head -c 13 good.tar

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-27 Thread Ethan Furman
Ethan Furman added the comment: Actually, the OP was using 2.7.6.

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-27 Thread Ethan Furman
Ethan Furman added the comment: I took an existing tar file and chopped it in half with `head -c`. I was able to extract half the files, but I didn't check the viability of the last file as I was looking for tar or python error feedback.

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-21 Thread Ned Deily
Changes by Ned Deily n...@acm.org: -- nosy: +lars.gustaebel

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-21 Thread Thomas Güttler
New submission from Thomas Güttler: The Python tarfile library does not detect a broken tar. user@host$ wc -c good.tar 143360 good.tar user@host$ head -c 13 good.tar > cut.tar user@host$ tar -tf cut.tar ... tar: Unexpected EOF in archive tar: Error is not recoverable: exiting now Very

[issue24259] tar.extractall() does not recognize unexpected EOF

2015-05-21 Thread Ethan Furman
Changes by Ethan Furman et...@stoneleaf.us: -- nosy: +ethan.furman