Daniel Hillier <daniel.hill...@gmail.com> added the comment:

Could something similar be achieved by looking for the earliest file header 
offset?

def find_earliest_header_offset(zf):
    # Smallest local file header offset recorded in the central directory,
    # or None if the archive has no entries.
    return min((zinfo.header_offset for zinfo in zf.infolist()), default=None)


You could also adapt this, using

    zinfo.compress_size + len(zinfo.FileHeader())

as the size of each entry, to see whether there are any sections inside the 
archive which are not referenced from the central directory. Not sure if zip 
files with arbitrary bytes inside the archive would be valid everywhere, but I 
think zipfile accepts them.
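A rough sketch of that adaptation, under a few assumptions: the helper name 
find_unreferenced_gaps is made up for illustration, the header length from 
FileHeader() is only an estimate (the local header on disk can carry a 
different extra field than the central directory record), and entries written 
with a trailing data descriptor would show up as small false gaps. It only 
looks at the space before the first entry and between entries.

```python
import io
import zipfile


def find_unreferenced_gaps(zf):
    """Return (start, end) byte ranges, before or between entries, that no
    central directory record accounts for. Hypothetical helper."""
    gaps = []
    expected = 0  # where the next local header would start if tightly packed
    for zinfo in sorted(zf.infolist(), key=lambda z: z.header_offset):
        if zinfo.header_offset > expected:
            gaps.append((expected, zinfo.header_offset))
        # Estimated entry size: rebuilt local header plus compressed data.
        entry_size = len(zinfo.FileHeader()) + zinfo.compress_size
        expected = max(expected, zinfo.header_offset + entry_size)
    return gaps


# Usage: build a small archive in memory, then put 4 stray bytes in front.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", b"hello")
    zf.writestr("b.txt", b"world")

with zipfile.ZipFile(io.BytesIO(b"JUNK" + buf.getvalue())) as zf:
    gaps = find_unreferenced_gaps(zf)
```

Sorting by header_offset matters because the central directory does not have 
to list entries in the order they appear in the file.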

You can also have zipped content inside an archive which has a valid file 
header but no reference from the central directory. Those entries are 
discoverable by implementations which process content serially from the start 
of the file, but not by implementations which rely on the central directory.
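To illustrate the difference, here is a minimal sketch that scans the raw bytes 
for the local file header signature (PK\x03\x04) and reports offsets the 
central directory does not reference. The function name 
find_orphan_header_offsets and the make_zip helper are hypothetical, and the 
matches are only candidates: the same four bytes can occur by chance inside 
compressed data or inside a nested, stored zip.

```python
import io
import zipfile

LOCAL_HEADER_SIG = b"PK\x03\x04"  # local file header signature


def find_orphan_header_offsets(data):
    """Return offsets of local header signatures in `data` that are not
    referenced from the central directory. Hypothetical helper."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        referenced = {zinfo.header_offset for zinfo in zf.infolist()}
    orphans = []
    pos = data.find(LOCAL_HEADER_SIG)
    while pos != -1:
        if pos not in referenced:
            orphans.append(pos)
        pos = data.find(LOCAL_HEADER_SIG, pos + 1)
    return orphans


def make_zip(name, payload):
    # Build a one-entry, uncompressed archive in memory (illustration only).
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, payload)
    return buf.getvalue()


# Usage: take only the local header and data of "hidden.txt" (30 fixed header
# bytes + name + payload) and place them in front of a complete archive.
hidden = make_zip("hidden.txt", b"secret")
hidden_entry = hidden[: 30 + len("hidden.txt") + len(b"secret")]
combined = hidden_entry + make_zip("a.txt", b"hello")
orphans = find_orphan_header_offsets(combined)
```

Here zipfile still opens the combined data and lists only a.txt, while the 
scan also finds the header of hidden.txt at offset 0.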

----------
nosy: +dhillier

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40301>
_______________________________________