Re: Zip madness !

jan i Sat, 01 Aug 2015 10:46:25 -0700

On 1 August 2015 at 19:33, Peter Kelly <pmke...@apache.org> wrote:

> Hi Jan,
>
> I’ve just fixed one bug I found (was causing a crash; but valgrind helped
> narrow it down) - a DFextZipDirEntry pointer was being set via incorrect
> pointer entry (see my commit to the newZipExperiment branch for details).
>
Thanks, I have hit this crash as well, though not consistently. I will pull
your fix before I change branch.




>
> After fixing this I got a correct directory listing of a test document I
> created in Word - I only tested it with one file however, so it may not
> address the problem you ran into with the particular test file you
> mentioned.
>
Super, do we have a bigger test document, with loads of files in it ?

rgds
jan i.


>
> —
> Dr Peter M. Kelly
> pmke...@apache.org
>
> PGP key: http://www.kellypmk.net/pgp-key
> (fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)
>
> > On 1 Aug 2015, at 10:41 pm, Peter Kelly <pmke...@apache.org> wrote:
> >
> > Hi  Jan,
> >
> > I’ll get to your question in a moment, but I just checked out the
> newZipExperiment branch and noticed that almost all of the source files
> have changed (I was expecting a relatively small diff, with only a few
> files changed). It looks like most of these differences are due to
> reordering the #includes at the top of each source file. If we’re going to
> do this, could we make it a separate commit in master, so it’s easier to
> see exactly what has changed in the zip branch?
> >
> > Actually I normally intentionally put system headers after other headers
> in the project, as it helps to detect cases where a custom header depends
> on types declared in a system header, and thus for which importing that
> header (by itself) in a source file would result in compilation errors due
> to the missing references. For example DFBuffer.h has an #include
<stdarg.h> at the top since some of the functions take the va_list data
type. If one of us uses this type in another header which doesn’t have
> #include <stdarg.h>, then any C file that imports it (directly or
> indirectly) has to remember to explicitly include stdarg.h (and that could
> be a *lot* of files, if the header is referenced from lots of places). So
by placing any system includes needed by the source file after all
> custom headers, we can pick up on these errors more easily.
> >
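A minimal sketch of the header-ordering point above, written as one file so it compiles on its own. MyBuffer.h and the MyBuffer_* functions are hypothetical names invented for illustration and are not taken from the project. The "header" part pulls in <stdarg.h> itself, so the "source" part can include it first and only afterwards add the system headers it needs for its own code.

```c
/* ---- hypothetical header "MyBuffer.h", inlined here for the sketch ---- */
#ifndef MYBUFFER_H
#define MYBUFFER_H

#include <stdarg.h> /* va_list is used in a prototype below */
#include <stddef.h> /* size_t is used in the prototypes below */

int MyBuffer_vformat(char *dst, size_t size, const char *fmt, va_list ap);
int MyBuffer_format(char *dst, size_t size, const char *fmt, ...);

#endif /* MYBUFFER_H */

/* ---- hypothetical source "MyBuffer.c" ---- */
/* In a real tree, #include "MyBuffer.h" would be the first line here.
 * Because nothing from the system has been included before it, a missing
 * #include <stdarg.h> in the header would show up as a compile error
 * right away instead of being masked by this file's own includes. */
#include <stdio.h> /* vsnprintf, needed only by the implementation */

int MyBuffer_vformat(char *dst, size_t size, const char *fmt, va_list ap)
{
    return vsnprintf(dst, size, fmt, ap);
}

int MyBuffer_format(char *dst, size_t size, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int written = MyBuffer_vformat(dst, size, fmt, ap);
    va_end(ap);
    return written;
}
```

If the #include <stdarg.h> line were removed from the header part, the first prototype would fail to compile, which is the kind of error this ordering is meant to surface early.
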
> > Regarding the zip file format, I need to look up on some stuff and will
> get back to you shortly. But I suspect some of the duplication may be
> related to the fact that a zip file is meant to be read backwards. Rather
> than starting at the beginning of the file, reading begins at the end,
> working backwards through the file to find potentially multiple copies of
> the directory listing. This serves two purposes:
> >
> > 1) You can “modify” the contents of a zip file simply by appending (with
> the compressed content of new/changed files added, and a new directory
listing including these files, and *not* including any files which have been
> “deleted”, i.e. masked out).
> >
> > 2) A zip file can be appended to the end of another file format; the
> most common example being self-extracting .exe files. Since .exe files are
> read from the beginning, the program loader on windows doesn’t care about
> the fact that there’s the trailing data at the end. And it’s still a valid
> zip file, since the .exe content at the start is ignored when reading the
> directory listing.
> >
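A rough sketch of the "read from the end" idea described above, assuming the whole archive is already in memory. It scans backwards for the end-of-central-directory record (signature bytes "PK\005\006", 22 fixed bytes followed by an optional comment of up to 65535 bytes). The function names find_eocd and eocd_total_entries are made up for this example, and the sketch only locates the last such record; it does not handle Zip64 or multi-disk archives.

```c
#include <stddef.h>
#include <stdint.h>

#define EOCD_MIN_SIZE    22u         /* fixed part of the record */
#define EOCD_MAX_COMMENT 65535u      /* comment length is a 16-bit field */
#define EOCD_SIGNATURE   0x06054b50u /* "PK\005\006" read as little-endian */

/* Zip stores all multi-byte integers in little-endian order. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns the offset of the end-of-central-directory
 * record, or -1 if none is found. Anything before the record (for example
 * a self-extractor stub) is simply never looked at. */
long find_eocd(const unsigned char *data, size_t len)
{
    if (len < EOCD_MIN_SIZE)
        return -1;

    size_t pos = len - EOCD_MIN_SIZE;
    size_t lowest = (pos > EOCD_MAX_COMMENT) ? pos - EOCD_MAX_COMMENT : 0;

    for (;;) {
        if (rd32(data + pos) == EOCD_SIGNATURE) {
            /* The comment length at offset 20 must account exactly for
             * the bytes that follow the fixed part of the record. */
            uint16_t commentLen = rd16(data + pos + 20);
            if (pos + EOCD_MIN_SIZE + commentLen == len)
                return (long)pos;
        }
        if (pos == lowest)
            break;
        pos--;
    }
    return -1;
}

/* Total number of central directory entries (field at offset 10). */
uint16_t eocd_total_entries(const unsigned char *eocd)
{
    return rd16(eocd + 10);
}
```

From the record found this way, the central directory size (offset 12) and its starting offset (offset 16) tell a reader where the directory listing begins.
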
> > I think you may be aware of some of these details already, and there are
some nuances I’ve probably missed. I’m about to have a look through the
> code you currently have in the branch.
> >
> > —
> > Dr Peter M. Kelly
> > pmke...@apache.org
> >
> > PGP key: http://www.kellypmk.net/pgp-key
> > (fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)
> >
> >> On 1 Aug 2015, at 4:33 pm, jan i <j...@apache.org> wrote:
> >>
> >> Hi
> >>
> >> Does anybody know why zip has a mad inefficient directory structure ?
> >>
> >> I try to understand the why, but fail.
> >>
> >> A zip file, contains 1 global directory with information about every
> single
> >> file (flat structure, no
> >> sub directories, but filenames may contain a "/"). That is logical and
> >> expected.
> >>
> >> BUT in front of every file, there is a local file header, with the filename
> >> and about 3/4 of the information
> >> from the global directory. This information seems purely redundant and
> >> unneeded.
> >>
> >> What am I missing here ? on one of my test docx, the local headers are
> >> about 10% of the filesize (looong filenames) which could be thrown away.
> >>
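To put rough numbers on the overhead mentioned above, here is a small sketch. It assumes the standard fixed sizes of the two record types (30 bytes for a local file header, 46 bytes for a central directory entry) plus their variable-length fields. The function names are invented for this example.

```c
#include <stddef.h>

/* Fixed-size parts of each record, before the variable-length fields. */
#define LOCAL_HEADER_FIXED   30u
#define CENTRAL_HEADER_FIXED 46u

/* Size of one local file header: fixed part + filename + extra field. */
size_t local_header_size(size_t nameLen, size_t extraLen)
{
    return LOCAL_HEADER_FIXED + nameLen + extraLen;
}

/* Size of one central directory entry: fixed part + filename + extra
 * field + per-file comment. */
size_t central_header_size(size_t nameLen, size_t extraLen, size_t commentLen)
{
    return CENTRAL_HEADER_FIXED + nameLen + extraLen + commentLen;
}

/* Percentage of the archive taken up by local headers, for entries
 * with the given filename lengths and no extra fields. */
double local_header_percent(const size_t *nameLens, size_t count,
                            size_t archiveSize)
{
    if (archiveSize == 0)
        return 0.0;

    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += local_header_size(nameLens[i], 0);

    return 100.0 * (double)total / (double)archiveSize;
}
```

With many small, well-compressed parts and long filenames, as in a docx, the per-file headers can plausibly add up to a noticeable share of the archive.
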
> >> Hope somebody can see what I failed to see.
> >> rgds
> >> jan i.
> >
>
>
