Martin Panter added the comment:

I suspect Eric’s file has non-zero, non-gzip garbage bytes appended to the end of it. Assuming I am right, here is a way to reproduce that scenario:
>>> from gzip import GzipFile
>>> from io import BytesIO
>>> file = BytesIO()
>>> with GzipFile(fileobj=file, mode="wb") as z:
...     z.write(b"data")
...
4
>>> file.write(b"garbage")
7
>>> file.seek(0)
0
>>> GzipFile(fileobj=file).read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/proj/python/cpython/Lib/gzip.py", line 274, in read
    return self._buffer.read(size)
  File "/home/proj/python/cpython/Lib/gzip.py", line 461, in read
    if not self._read_gzip_header():
  File "/home/proj/python/cpython/Lib/gzip.py", line 409, in _read_gzip_header
    raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'ga')

This is a bit different to Issue 1508475. That one is about cases where the “gzip” trailer has been truncated, although the compressed data is probably intact. This case is the converse: extra data has been added.

All of the “gzip”, “bzip2” and XZ Utils (for LZMA) command-line decompressors happily extract the compressed data without an error exit status, but emit warning messages:

gzip: stdin: decompression OK, trailing garbage ignored
bzip2: (stdin): trailing garbage after EOF ignored
xz: (stdin): Unexpected end of input

In Python, the “bz2” and LZMA modules successfully extract the compressed data, and ignore the non-compressed garbage at the end without even a warning. On the other hand, the “gzip” module has special code to ignore trailing zero bytes (Issue 2846), but treats any other trailing non-gzip data as an error.

So I think a strong argument could be made for the ability to extract all the compressed data from a file even if there is garbage appended. The question is, how would this support be added? Perhaps the mechanism chosen could also be integrated with a fix for Issue 1508475.
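For reference, a caller can already get the lenient behaviour by hand, by dropping down to the zlib module. The following is only a rough sketch, not a proposed patch: the helper name gunzip_lenient is made up for illustration, and it assumes the whole file fits in memory. It decodes gzip members one at a time and returns whatever could not be decoded as a separate value, so the caller can decide whether to warn about it.

```python
import gzip
import zlib


def gunzip_lenient(raw):
    """Decode every leading gzip member in raw.

    Returns (payload, trailing), where trailing holds the bytes left over
    after the last member that decoded completely. Hypothetical helper,
    not part of the gzip module.
    """
    chunks = []
    rest = raw
    while rest:
        # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
        decomp = zlib.decompressobj(zlib.MAX_WBITS | 16)
        try:
            chunk = decomp.decompress(rest)
        except zlib.error:
            if chunks:
                break  # Leftover bytes are not a gzip member; stop here.
            raise  # Not even the first member is valid.
        if not decomp.eof:
            if chunks:
                break  # Incomplete extra member; treated as trailing data.
            raise EOFError("gzip member ended before its trailer")
        chunks.append(chunk)
        rest = decomp.unused_data
    return b"".join(chunks), rest


payload, trailing = gunzip_lenient(gzip.compress(b"data") + b"garbage")
if trailing:
    print("trailing garbage ignored:", len(trailing), "bytes")
```

Note that this sketch also treats a truncated second member as trailing data, which may or may not be what is wanted. Returning the leftover bytes rather than dropping them is a deliberate choice here, so that real errors are not silently swallowed.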
Some options:

* Silently ignore the condition by default, like the other compression modules (consistent, but could silently swallow real errors)
* An optional new GzipFile(strict=False) mode
* Perhaps an exception deferred until close() is called

----------
nosy: +vadmium

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24301>
_______________________________________