Joerg.Schilling at fokus.fraunhofer.de (Joerg Schilling) wrote:

> > >If there are, then should we be deliberately incompatible?
> >
> > The star and the recent AT&T pax archivers encode sparse files using
> > ustar/pax format archives.  The project team has not seen any other
> > attempt to encode sparse files using cpio format.  So, there is no
> > other known cpio format that handles this case.  We are not being
> > deliberately incompatible; nothing else handles this case.
>
> If you mark the cpio archives using the proposal from above and if you
> use comma separated data_offset/data_size pairs to encode the hole list,
> I would be willing to implement this in star too.

Let me make a completely new proposal that does not touch any POSIX-defined
field and that allows further extensions to be added in the future. It is based
on extension ideas from David Korn and Glenn Fowler.

If the archive type name is either "ascii_sparse" or "odc_sparse", the string

"VSUN\0\0"

is appended to all file names in the cpio header, and the field c_namesize is
filled with a number that is 6 larger than expected for a standard cpio archive.

If a sparse file is encountered, the following string is used instead:

"VSUN\0S%jx\0\0", (uintmax_t)statbuf.st_size

c_namesize is filled with a number that is larger than in a standard cpio
archive by the length of this string.
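
To make the c_namesize arithmetic concrete, here is a minimal sketch of how an
archiver could build the extended name field. The function name
build_name_field() is hypothetical, and the sketch assumes that the marker is
placed directly after the NUL byte that terminates the file name; the proposal
itself only says "appended".

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the extended name field in buf and return the value to store
 * in c_namesize, or 0 if buf is too small.
 * ASSUMPTION: the marker follows the name's terminating NUL byte.
 */
size_t
build_name_field(char *buf, size_t buflen, const char *name,
                 int is_sparse, uintmax_t st_size)
{
    char   ext[64];                     /* marker + at most 16 hex digits */
    size_t extlen;
    size_t stdlen;

    assert(buf != NULL && name != NULL);
    stdlen = strlen(name) + 1;          /* standard c_namesize, incl. NUL */

    if (is_sparse) {
        /* "VSUN\0S%jx\0\0": marker, 'S', file size in hex, two NULs */
        int n = snprintf(ext + 6, sizeof(ext) - 6, "%jx", st_size);
        if (n < 0)
            return 0;
        memcpy(ext, "VSUN\0S", 6);
        ext[6 + n] = '\0';
        ext[6 + n + 1] = '\0';
        extlen = 6 + (size_t)n + 2;
    } else {
        memcpy(ext, "VSUN\0\0", 6);     /* plain marker: 6 extra bytes */
        extlen = 6;
    }

    if (stdlen + extlen > buflen)
        return 0;
    memcpy(buf, name, stdlen);
    memcpy(buf + stdlen, ext, extlen);
    return stdlen + extlen;
}
```

With this reading, a non-sparse file "a.txt" gets c_namesize 12 instead of 6,
and a sparse file "f" of size 0x1000 gets c_namesize 14 instead of 2.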

In case of a sparse file, the following hole list is used inside the file data
chunk:

"%jx\n%s", (uintmax_t)number_of_holes_in_list

The "%s" string is replaced by "number_of_holes_in_list" entries of the
following form:

"%jx %jx\n", (uintmax_t)data_offset, (uintmax_t)data_size

Sparse files that end in a hole are treated the same way as in star.
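
As an illustration of the hole list layout, the following sketch formats the
count line and the data_offset/data_size entries into a buffer. The struct
data_extent and the function format_hole_list() are hypothetical names, and
the sketch does not attempt to model the star convention for files that end in
a hole.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One list entry: where a data region starts and how long it is. */
struct data_extent {
    uintmax_t data_offset;
    uintmax_t data_size;
};

/*
 * Write "%jx\n" (entry count) followed by one "%jx %jx\n" line per entry.
 * Returns the number of bytes written, not counting the terminating NUL,
 * or 0 if buf is too small.
 */
size_t
format_hole_list(char *buf, size_t buflen,
                 const struct data_extent *ext, size_t count)
{
    size_t used;
    size_t i;
    int    n;

    assert(buf != NULL);
    n = snprintf(buf, buflen, "%jx\n", (uintmax_t)count);
    if (n < 0 || (size_t)n >= buflen)
        return 0;
    used = (size_t)n;

    for (i = 0; i < count; i++) {
        n = snprintf(buf + used, buflen - used, "%jx %jx\n",
                     ext[i].data_offset, ext[i].data_size);
        if (n < 0 || (size_t)n >= buflen - used)
            return 0;
        used += (size_t)n;
    }
    return used;
}
```

For two data regions, 0x1000 bytes at offset 0 and 0x200 bytes at offset
0x100000, this produces the three lines "2", "0 1000" and "100000 200".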


Using hex numbers instead of decimal numbers is aligned with the AT&T format
and reduces the size of the hole list by about 20%. Using "data_size" instead
of "hole_offset" reduces the hole list by another 20%.

The field c_filesize contains a size that is equal to the compressed file size 
+ the size of the hole list.
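
A small sketch of that computation, assuming that "compressed file size" means
the sum of the data_size values, i.e. the file content with the holes left
out. The names struct data_extent and sparse_c_filesize() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One list entry: where a data region starts and how long it is. */
struct data_extent {
    uintmax_t data_offset;
    uintmax_t data_size;
};

/*
 * Compute c_filesize for a sparse file: the length of the textual hole
 * list plus the number of data bytes that are actually archived.
 * ASSUMPTION: only the bytes covered by the entries are stored.
 */
uintmax_t
sparse_c_filesize(const struct data_extent *ext, size_t count)
{
    uintmax_t total;
    size_t    i;
    int       n;

    /* snprintf(NULL, 0, ...) returns the length the output would have. */
    n = snprintf(NULL, 0, "%jx\n", (uintmax_t)count);
    assert(n > 0);
    total = (uintmax_t)n;

    for (i = 0; i < count; i++) {
        n = snprintf(NULL, 0, "%jx %jx\n",
                     ext[i].data_offset, ext[i].data_size);
        assert(n > 0);
        total += (uintmax_t)n;          /* one hole list line */
        total += ext[i].data_size;      /* archived data bytes */
    }
    return total;
}
```

For the two regions 0/0x1000 and 0x100000/0x200, the hole list takes 20 bytes
and the data 4608 bytes, so c_filesize would be 4628.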

Please comment!

Jörg

-- 
 EMail:joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js at cs.tu-berlin.de                (uni)
       schilling at fokus.fraunhofer.de     (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
