Patches item #1159051, was opened at 2005-03-08 14:57
Message generated for change (Comment added) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1159051&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Wummel (calvin)
Assigned to: Nobody/Anonymous (nobody)
Summary: Handle corrupted gzip files with unexpected EOF

Initial Comment:
GzipFile raises an unhandled struct.error when reading a
corrupted .gz file (attached as t.gz) whose CRC checksum
at the end is missing.
Tested with python2.3, python2.4 and CVS python on a
Debian Linux system.
$ python2.3 t.py
Traceback (most recent call last):
  File "t.py", line 4, in ?
    print gzip.GzipFile('', 'rb', 9, fileobj).read()
  File "/usr/lib/python2.3/gzip.py", line 217, in read
    self._read(readsize)
  File "/usr/lib/python2.3/gzip.py", line 289, in _read
    self._read_eof()
  File "/usr/lib/python2.3/gzip.py", line 305, in _read_eof
    crc32 = read32(self.fileobj)
  File "/usr/lib/python2.3/gzip.py", line 40, in read32
    return struct.unpack("<l", input.read(4))[0]
struct.error: unpack str size does not match format
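
The failure can probably be reproduced without the attached t.gz. The
sketch below is an assumption about the corruption, not the submitter's
file: it builds a gzip stream and strips the 8-byte trailer (CRC32 plus
ISIZE, per RFC 1952). The read32 helper copies the unpack call shown in
the traceback; payload, truncated and raised are illustrative names.

```python
import gzip
import io
import struct

# Assumed corruption: the 8-byte trailer (CRC32 + ISIZE) is missing entirely.
payload = b"example payload " * 4
truncated = gzip.compress(payload)[:-8]


def read32(fileobj):
    # Same unpack call as read32 in the traceback: no length check.
    return struct.unpack("<l", fileobj.read(4))[0]


fileobj = io.BytesIO(truncated)
fileobj.read()  # stand-in for the decompressor consuming all remaining input

try:
    read32(fileobj)  # fileobj.read(4) now returns b""
    raised = False
except struct.error:
    raised = True
```

Once the input is exhausted, read(4) returns fewer than four bytes and
struct.unpack rejects the short buffer, which matches the last frame of
the traceback.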

The attached patch (against current CVS) tries to cope
with this situation by
a) detecting the missing data by examining the rewind
value and
b) assuming that EOF is reached and returning the
buffered uncompressed data (by raising EOFError)
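
As a rough illustration of steps a) and b), here is a minimal sketch.
It does not reproduce the attached patch: the patch examines the rewind
value, which is not shown in this message, so the sketch substitutes a
plain length check on the bytes read. read_trailer is a hypothetical
helper, and the unsigned "<I" format is used here instead of the "<l"
seen in the traceback.

```python
import io
import struct


def read32(fileobj):
    """Read a little-endian 32-bit value; raise EOFError on a short read."""
    chunk = fileobj.read(4)
    if len(chunk) < 4:
        # a) detect the missing data
        raise EOFError("gzip trailer is truncated")
    return struct.unpack("<I", chunk)[0]


def read_trailer(fileobj):
    """Return (crc32, isize), or None if the trailer is missing or short."""
    try:
        crc32 = read32(fileobj)
        isize = read32(fileobj)
    except EOFError:
        # b) treat the condition as end of file; the caller keeps
        #    whatever uncompressed data it has already buffered
        return None
    return crc32, isize


complete = read_trailer(io.BytesIO(struct.pack("<II", 1, 2)))
missing = read_trailer(io.BytesIO(b""))
```

With a complete trailer the caller can still verify the CRC and size;
with a missing one it skips verification and returns the buffered data.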

For the record, I encountered this kind of bug when
downloading HTML pages with Content-Encoding: gzip. It
seems some versions of the mod_gzip Apache module
produce corrupted gzip data.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2007-03-06 14:02

Message:
Logged In: YES 
user_id=21627
Originator: NO

I don't think it is right to treat this as EOF. I haven't fully understood
the problem (can you explain the format error encountered: what is
expected, and what is sent instead?). However, I notice that gzip itself
also reports an "unexpected EOF" error, so I don't think we should
silently decompress the file. Raising an exception that names the actual
problem and carries the data received so far would be more
appropriate.
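
One way such an exception could look, as a minimal sketch rather than a
proposal for gzip.py itself: TruncatedGzipError and gunzip_strict are
hypothetical names, and the sketch decodes with zlib's gzip mode
(wbits of 16 + MAX_WBITS) instead of going through GzipFile.

```python
import gzip
import zlib


class TruncatedGzipError(EOFError):
    """Input ended before the gzip stream was complete."""

    def __init__(self, partial):
        super().__init__("unexpected end of gzip stream")
        # Data decompressed before the input ran out.
        self.partial = partial


def gunzip_strict(raw):
    """Decompress one gzip member; raise if the stream is truncated."""
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)  # gzip header and trailer
    data = decomp.decompress(raw)
    if not decomp.eof:
        # The trailer (or part of the deflate data) never arrived.
        raise TruncatedGzipError(data)
    return data


payload = b"example payload " * 4
good = gzip.compress(payload)

try:
    gunzip_strict(good[:-8])  # assumed corruption: 8-byte trailer stripped
    recovered = None
except TruncatedGzipError as exc:
    recovered = exc.partial
```

A caller that wants the lenient behaviour can catch the exception and
use its partial attribute; everyone else still sees the error.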

Also, can you provide an example file for the test suite that isn't
copyrighted by somebody else?

----------------------------------------------------------------------

_______________________________________________
Patches mailing list
[email protected]
http://mail.python.org/mailman/listinfo/patches
