Massimo Sala <massimo.sala...@gmail.com> added the comment:
I chose to use the internal variable *concat* because:
- if I recollect correctly, it is calculated before the successive routines;
- I didn't see your solution (!): there was a very nice computed variable
  right in front of my eyes.

Mmh

1) Reliability
We cannot be sure this always runs with malformed files:
    for zinfo in zf.infolist():
We can use try / except, but then we lose the computation.
If *concat* is already computed (unless the file is completely damaged),
IMHO my solution is better.

2) Performance
What is the performance for big files?
Are there file seeks due to traversing zf.infolist()?

> Daniel wrote:
> the advantage is that it already works in python 2.7 so there is no need
> to patch Python

Yes, indeed.
If I am right about the pros of my patch, I stand by it.

Many thanks for your attention.

On Sat, 18 Apr 2020 at 15:45, Daniel Hillier <rep...@bugs.python.org> wrote:
>
> Daniel Hillier <daniel.hill...@gmail.com> added the comment:
>
> Hi Massimo,
>
> Unless I'm missing something about your requirements, the advantage is that
> it already works in python 2.7 so there is no need to patch Python. Just
> bundle the above function with your analysis tool and you're good to go.
>
> Cheers,
> Dan
>
> On Sat, Apr 18, 2020 at 11:36 PM Massimo Sala <rep...@bugs.python.org>
> wrote:
>
> > Massimo Sala <massimo.sala...@gmail.com> added the comment:
> >
> > Hi Daniel
> >
> > Could you please elaborate the advantages of your loop versus my two lines
> > of code?
> > I don't grasp...
> >
> > Thanks, Massimo
> >
> > On Sat, 18 Apr 2020 at 03:26, Daniel Hillier <rep...@bugs.python.org>
> > wrote:
> >
> > > Daniel Hillier <daniel.hill...@gmail.com> added the comment:
> > >
> > > Could something similar be achieved by looking for the earliest file
> > > header offset?
> > >
> > > def find_earliest_header_offset(zf):
> > >     earliest_offset = None
> > >     for zinfo in zf.infolist():
> > >         if earliest_offset is None:
> > >             earliest_offset = zinfo.header_offset
> > >         else:
> > >             earliest_offset = min(zinfo.header_offset, earliest_offset)
> > >     return earliest_offset
> > >
> > >
> > > You could also adapt this using
> > >
> > >     zinfo.compress_size + len(zinfo.FileHeader())
> > >
> > > to see if there were any sections inside the archive which were not
> > > referenced from the central directory. Not sure if zip files with
> > > arbitrary bytes inside the archive would be valid everywhere, but I
> > > think they are using zipfile.
> > >
> > > You can also have zipped content inside an archive which has a valid
> > > fileheader but no reference from the central directory. Those entries
> > > are discoverable by implementations which process content serially from
> > > the start of the file but not implementations which rely on the central
> > > directory.
> > >
> > > ----------
> > > nosy: +dhillier
> > >
> > > _______________________________________
> > > Python tracker <rep...@bugs.python.org>
> > > <https://bugs.python.org/issue40301>
> > > _______________________________________

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40301>
_______________________________________
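
The earliest-header-offset idea and the try / except question from this thread can be sketched together. This is only a minimal sketch under stated assumptions, not code from the tracker: `prepended_size` is a hypothetical helper name, the in-memory archive is made-up test data, and it assumes Python 3, where a file that cannot be parsed raises `zipfile.BadZipFile`. As far as the CPython source shows, the central directory is parsed when the `ZipFile` is opened and `infolist()` returns that in-memory list, so the loop itself should not trigger additional file seeks.

```python
import io
import zipfile


def find_earliest_header_offset(zf):
    """Smallest local-header offset listed in the central directory.

    Returns None for an archive with no members.
    """
    return min((zinfo.header_offset for zinfo in zf.infolist()), default=None)


def prepended_size(fileobj):
    """Hypothetical helper: earliest header offset, or None if unreadable.

    If the central directory cannot be parsed there is nothing to iterate
    over, so the result is simply None.
    """
    try:
        with zipfile.ZipFile(fileobj) as zf:
            return find_earliest_header_offset(zf)
    except zipfile.BadZipFile:
        return None


# Illustration only: build a small archive in memory, then prepend 16 bytes.
payload = io.BytesIO()
with zipfile.ZipFile(payload, "w") as zf:
    zf.writestr("a.txt", "hello")
data = b"X" * 16 + payload.getvalue()

print(prepended_size(io.BytesIO(data)))
```

For an archive with nothing prepended the helper returns 0. A file whose central directory cannot be read yields None, which is the case where the computation is lost.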