On Feb 2, 2008 1:43 AM, Danek Duvall <[EMAIL PROTECTED]> wrote:
> On Fri, Feb 01, 2008 at 10:08:05PM -0600, Mike Gerdts wrote:
>
> > I would like to see something different than tar, because...
> >
> > - The logical thing to do with a tar file is to compress it.  If you
> >   are working on a system with slow single thread performance,
> >   processing compressed tar (or cpio) archives is extremely slow.
>
> We could, of course, compress each file in the archive individually, which
> would allow for multiple threads to do the decompression in parallel.  Of
> course, you still have to read through the archive to find out where
> things are, unless you tack on a header that contains the table of
> contents.

The header could simply give the offset and length of the TOC.  A byte
range request (or lseek if on-disk) could be used to retrieve the TOC.
This would cause problems for tape, but I think that the days of
delivery of installable software on tape are far behind us.
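
To make the idea concrete, here is a rough sketch of such a layout.
Everything in it is an assumption for illustration only: the magic
bytes, the 20-byte header, the JSON-encoded TOC, and the function names
are hypothetical, not a proposed format.  Each member is compressed on
its own with zlib, so members could be decompressed independently (and
in parallel) once the TOC has been read.

```python
import io
import json
import struct
import zlib

MAGIC = b"PKGA"                   # hypothetical magic bytes
HEADER = struct.Struct(">4sQQ")   # magic, TOC offset, TOC length (20 bytes)


def write_archive(out, files):
    """Write header, individually compressed members, then the TOC.

    `out` is a seekable binary file object positioned at its start;
    `files` maps name -> bytes.
    """
    out.write(HEADER.pack(MAGIC, 0, 0))        # placeholder, patched below
    toc = {}
    for name, data in files.items():
        blob = zlib.compress(data)             # each member compressed alone
        toc[name] = {"offset": out.tell(), "length": len(blob)}
        out.write(blob)
    toc_bytes = json.dumps(toc).encode("utf-8")
    toc_offset = out.tell()
    out.write(toc_bytes)
    out.seek(0)                                # go back and fill in the header
    out.write(HEADER.pack(MAGIC, toc_offset, len(toc_bytes)))
    out.seek(0, io.SEEK_END)


def read_toc(f):
    """Read the fixed-size header, then seek straight to the TOC."""
    f.seek(0)
    magic, toc_offset, toc_length = HEADER.unpack(f.read(HEADER.size))
    if magic != MAGIC:
        raise ValueError("not an archive in this sketch's format")
    f.seek(toc_offset)
    return json.loads(f.read(toc_length))


def read_member(f, toc, name):
    """Fetch and decompress one member without touching the others."""
    entry = toc[name]
    f.seek(entry["offset"])
    return zlib.decompress(f.read(entry["length"]))
```

Over HTTP, the two seeks in read_toc would become two byte range
requests: one for the first 20 bytes, one for the TOC itself.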

> > From my recollection of the zap discussion and my own frustration with
> > single-threaded performance on US-T1 systems, I would like to see:
> >
> > - All metadata at the beginning of the file.  This is helpful when the
> >   file is delivered via a standard HTTP download (no IPS server) to
> >   determine which byte ranges need to be downloaded.
>
> This is one of my big beefs with zip-based formats.  There's no reason that
> a smart protocol couldn't download the TOC first, place it at the end of a
> holey file, and fill in the rest afterwards.  Seems a bit icky, but at
> least opensource zip tools are very widely available (including the python
> module), so we wouldn't have to write anything new.

If you are going to download the entire file, breaking it up like this
may be pointless.  If a package has 32-bit and 64-bit executables for
sparc and x86 along with 100 localizations, you will likely need to
download only a small subset of the files to install the software onto
a laptop that only you use.  In that case, downloading (or otherwise
reading) the TOC and then the appropriate byte ranges from within the
package would be appropriate.

This use case may speak to how the tool that creates the archive
arranges content.  Grouping content that is used in similar
circumstances together could greatly reduce latency for packages
retrieved over HTTP or read from optical media.
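
As a rough illustration of why grouping matters, the sketch below picks
the TOC entries a given client needs and merges adjacent byte ranges
into as few requests as possible.  The TOC shape (a list of dicts with
"offset", "length", and "tags") and the function names are assumptions
made up for this example; nothing here is an existing IPS interface.

```python
def select_entries(toc, wanted):
    """Keep entries whose tags all match `wanted`; untagged entries
    are always kept."""
    return [e for e in toc
            if all(wanted.get(k) == v for k, v in e["tags"].items())]


def coalesce(entries, max_gap=0):
    """Merge entries into inclusive (start, end) byte ranges.

    Ranges separated by at most `max_gap` unwanted bytes are merged,
    trading a little extra data for fewer round trips.
    """
    ranges = []
    for e in sorted(entries, key=lambda e: e["offset"]):
        if e["length"] == 0:
            continue                      # nothing to fetch
        start = e["offset"]
        end = start + e["length"] - 1     # HTTP ranges are inclusive
        if ranges and start - ranges[-1][1] - 1 <= max_gap:
            ranges[-1][1] = max(ranges[-1][1], end)
        else:
            ranges.append([start, end])
    return [tuple(r) for r in ranges]


def range_header(ranges):
    """Format the ranges as the value of an HTTP Range header."""
    return "bytes=" + ",".join(f"{s}-{e}" for s, e in ranges)


# Hypothetical TOC: executables per architecture, then localizations.
toc = [
    {"name": "bin/sparcv9/app", "offset": 20,  "length": 100, "tags": {"arch": "sparc"}},
    {"name": "bin/amd64/app",   "offset": 120, "length": 100, "tags": {"arch": "i386"}},
    {"name": "locale/de",       "offset": 220, "length": 50,  "tags": {"locale": "de"}},
    {"name": "locale/en",       "offset": 270, "length": 50,  "tags": {"locale": "en"}},
    {"name": "share/doc",       "offset": 320, "length": 30,  "tags": {}},
]
needed = select_entries(toc, {"arch": "i386", "locale": "en"})
ranges = coalesce(needed)
```

With this layout the three needed files collapse into two ranges.  Had
the archive creator placed the English locale right after the x86
binary, a single range would have covered everything.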

--
Mike Gerdts
http://mgerdts.blogspot.com/