New submission from Craig de Stigter <craig...@gmail.com>:

Steps to reproduce:

# create a large (>4gb) file
f = open('foo.txt', 'wb')
text = 'a' * 1024**2
for i in xrange(5 * 1024):
    f.write(text)
f.close()

# now zip the file
import zipfile
z = zipfile.ZipFile('foo.zip', mode='w', allowZip64=True)
z.write('foo.txt')
z.close()


Now inspect the local file header using a hex editor. The header is written 
incorrectly: the compressed and uncompressed size fields should both be 
0xffffffff, and the Zip64 'extra field' should contain the actual sizes.
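
As an alternative to a hex editor, the check can be sketched in Python. This is 
only an illustration, not part of the patch: the helper name parse_local_header 
is made up here, and the field layout (30-byte fixed header, Zip64 extra field 
ID 0x0001 holding the uncompressed size followed by the compressed size) is 
taken from the ZIP format specification.

```python
import struct

# Fixed 30-byte part of a ZIP local file header (little-endian):
# signature, version, flags, method, mod time, mod date,
# crc32, compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct("<4s5H3L2H")
ZIP64_EXTRA_ID = 0x0001


def parse_local_header(data):
    """Return (compressed_size, uncompressed_size) for one local header.

    Hypothetical helper: if both 32-bit fields are 0xffffffff, the real
    sizes are read from the Zip64 extra field instead.
    """
    (sig, _ver, _flags, _method, _time, _date, _crc,
     csize, usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(data, 0)
    if sig != b"PK\x03\x04":
        raise ValueError("not a local file header")
    start = LOCAL_HEADER.size + name_len
    extra = data[start:start + extra_len]
    pos = 0
    # Walk the extra field records: 2-byte ID, 2-byte length, payload.
    while pos + 4 <= len(extra):
        field_id, field_len = struct.unpack_from("<HH", extra, pos)
        if (field_id == ZIP64_EXTRA_ID
                and usize == 0xFFFFFFFF and csize == 0xFFFFFFFF):
            usize, csize = struct.unpack_from("<QQ", extra, pos + 4)
        pos += 4 + field_len
    return csize, usize
```

Passing the first few hundred bytes of foo.zip to this function should show 
whether the sizes come back as the real values or as truncated 32-bit numbers.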


Tested on Python 2.5 but looking at the latest code in 3.2 it still looks 
broken.

The problem is that the output of ZipInfo.FileHeader() is written before the 
file size is populated, so the Zip64 extensions are not written. Later, the 
sizes in the header are rewritten, but the Zip64 extensions are not taken into 
account and the file size simply wraps around modulo 4 GB (7 GB becomes 3 GB, 
for instance).
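
The wrap-around is plain 32-bit truncation. A small sketch of the arithmetic 
(the function name wrapped_size is only for illustration):

```python
GIB = 1024 ** 3


def wrapped_size(size):
    """Value that remains when a size is squeezed into a 32-bit field."""
    return size & 0xFFFFFFFF


# A 7 GiB file ends up recorded as 3 GiB, a 5 GiB file as 1 GiB.
print(wrapped_size(7 * GIB) // GIB)  # 3
print(wrapped_size(5 * GIB) // GIB)  # 1
```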

My patch fixes the problem on Python 2.5; it might need minor porting to apply 
to trunk. It works by assigning the uncompressed file size to the ZipInfo 
initially and then writing the header. Later on, I rewrite the header (this is 
okay since the header size will not have increased).
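
The write-then-rewrite idea can be sketched in isolation. This is not the patch 
itself and does not use the real zip header layout: pack_header and write_entry 
are hypothetical names, and the header is a made-up fixed-width record, chosen 
only to show why rewriting in place is safe when the header length does not 
change.

```python
import io
import struct


def pack_header(size):
    # Hypothetical fixed-width header: 4-byte magic plus an 8-byte size.
    return struct.pack("<4sQ", b"DEMO", size)


def write_entry(out, chunks, estimated_size):
    """Write a provisional header, then the data, then patch the header."""
    header_pos = out.tell()
    out.write(pack_header(estimated_size))  # provisional header
    actual = 0
    for chunk in chunks:
        out.write(chunk)
        actual += len(chunk)
    end_pos = out.tell()
    out.seek(header_pos)
    # Same length as the provisional header, so the data is not overwritten.
    out.write(pack_header(actual))
    out.seek(end_pos)  # restore the position for the next entry
    return actual


buf = io.BytesIO()
write_entry(buf, [b"ab", b"cde"], estimated_size=0)
```

If the rewritten header could be longer than the provisional one, it would 
overwrite the start of the file data, which is why the initial size estimate 
matters.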

----------
components: Library (Lib)
files: zipfile_zip64_header.patch
keywords: patch
messages: 115250
nosy: craigds
priority: normal
severity: normal
status: open
title: zipfile writes incorrect local file header for large files in zip64
type: behavior
versions: Python 2.5, Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3
Added file: http://bugs.python.org/file18685/zipfile_zip64_header.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9720>
_______________________________________