Don Cragun <don.cragun at sun.com> wrote:

> There are no extensible headers in cpio format. The only place to
In theory, there is a way to extend the cpio header. Glen Fowler and David Korn use this method to add a small amount of extra data to the cpio header. The method is to extend the space "occupied" by the file name by setting c_namesize to a value > strlen(pathname) + 1 and to write the extensions into this space.

On POSIX cpio archives, you may add 262143 bytes - strlen(pathname) - 1.
On SVr4 cpio archives, you may add 4294967295 bytes - strlen(pathname) - 1.

I would guess that many cpio implementations will dump core if this method is used to set c_namesize > 1024, but this is a location that could be used to add a vendor fingerprint that allows the modified archive format to be detected reliably.

Glen Fowler uses:

d<hex number>   For a "long" st_rdev in POSIX cpio archives.
g<hex number>   For a "long" group ID.
s<hex number>   For a "long" file size.
u<hex number>   For a "long" user ID.
G<name>         For a group name.
U<name>         For a user name.

All fields are '\0' terminated and the end of the list is a double '\0'. "long" numbers are written out as intmax_t, but it seems that they are read back as native C "long" only.

I recommend using:

V<vendor>       In our case "VSUN", as a marker that this is a cpio archive with Sun extensions.

and reporting this to Glen Fowler and David Korn. The current pax implementation from AT&T silently skips unknown fields.

> store data on where the holes go in cpio format is in the file data
> area. The project team had the option of storing hole information
> followed by the complete file contents or storing the hole information
> followed by the file with the holes removed. Since the cpio format
> only has 33 bits to store the size of the file data area, the project
> team chose to remove the holes to increase the size of a sparse file
> that can be archived in cpio format.
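Returning to the c_namesize extension described above: a minimal Python sketch of how such an extended name field could be built and read back. The function names are hypothetical, and two details are assumptions not confirmed by the AT&T code: that the extension fields start directly after the '\0' that terminates the pathname, and that numbers are written as lower-case hex without a prefix.

```python
# Sketch only: helper names and the exact byte layout are assumptions.
POSIX_NAMESIZE_MAX = 0o777777    # 262143, largest c_namesize in POSIX cpio
SVR4_NAMESIZE_MAX = 0xFFFFFFFF   # 4294967295, largest c_namesize in SVr4 cpio


def build_name_field(pathname, fields, limit=POSIX_NAMESIZE_MAX):
    """Return the bytes for the name area; c_namesize is len() of the result."""
    out = pathname + b"\0"                 # the normal, NUL-terminated name
    for key, value in fields:              # e.g. ("V", b"SUN"), ("u", b"186a0")
        out += key.encode("ascii") + value + b"\0"
    out += b"\0"                           # second NUL marks the end of the list
    if len(out) > limit:
        raise ValueError("extended name field exceeds c_namesize limit")
    return out


def parse_name_field(raw):
    """Split a name area into (pathname, [(key, value), ...])."""
    pathname, _, rest = raw.partition(b"\0")
    fields = []
    while rest and rest[0:1] != b"\0":     # an empty field ends the list
        item, _, rest = rest.partition(b"\0")
        fields.append((chr(item[0]), item[1:]))
    return pathname, fields


raw = build_name_field(b"dir/file",
                       [("V", b"SUN"), ("u", format(100000, "x").encode())])
name, fields = parse_name_field(raw)
```

A reader that does not know a key can skip that field and continue, which matches the behaviour reported for the AT&T pax implementation. A name area without extensions parses to an empty field list.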
> While it is true that it would be
> possible to encode holes data in a slightly more compact form, it is
> nice to have a common format for the holes data in the ustar/pax
> extended header records for sparse files and in the data area in the
> cpio file data area.

The decision to put the hole information into the file data area is OK with me, but the simplest way to compress this information "better" is to use data_offset/data_size instead of your current proposal.

> >If there are, then should we be deliberately incompatible?
>
> The star and the recent AT&T pax archivers encode sparse files using
> ustar/pax format archives. The project team has not seen any other
> attempt to encode sparse files using cpio format. So, there is no
> other known cpio format that handles this case. We are not being
> deliberately incompatible; nothing else handles this case.

If you mark the cpio archives using the proposal above and use comma-separated data_offset/data_size pairs to encode the hole list, I would be willing to implement this in star too.

Jörg

-- 
EMail:joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
      js at cs.tu-berlin.de (uni)
      schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily