On Tue, May 06, 2008 at 10:32:28AM -0700, Don Cragun wrote:
> >Detail:
> >
> > The SAM-QFS daemons 'sam-stagerd' and 'sam-arcopy' are currently
> > reading and writing a tape format that is a slightly modified version
> > of a format used by a very old version of GNU tar. The older GNU tar
> > format was modified for SAM-QFS because it didn't satisfy the project's
> > needs for file size and long file names.
> >
> > The SAM-QFS team would like to modify these daemons to read and write
> > a pax compliant tar format, as described in [1].
> >
> > The default behavior will call for 'sam-arcopy' to continue writing
> > the old-style tar format. The new format can be selected by setting a
> > new 'tar_format = posix' option in the existing SAM-QFS config file,
> > /etc/opt/SUNWsamfs/defaults.conf.
>
> Why isn't this 'tar_format = pax'? The POSIX standard specifies three
The project team agreed to this change.
> > The 'sam-stagerd' daemon will read the tape by validating the magic in
> > the tar header. If the old GNU magic of "ustar[sp][sp][null]" is
> > found then the old format is read. If the new magic of
> > "ustar[null]00" is found then the new format will be read.
>
> In the standard, ustar and pax archive formats have magic as a 6 octet
> field. Am I correct in assuming that you mean magic will be set to
> "ustar" (with a trailing null byte) and version will be set to "00"
> (with no trailing null) as specified for the ustar and pax archive
> formats?
The functional spec has been clarified to say:
The pax format requires that the magic is set to "ustar[null]" and the
version to "00" (which is not null terminated).
I've updated the case with the new spec.
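
For illustration only, the magic check described above could be sketched as
follows. This is not the SAM-QFS code: the function name classify_header and
the returned labels are hypothetical, and Python's tarfile module is used only
to generate sample headers. The field offset (257) and the widths (6-octet
magic, 2-octet version) come from the ustar header layout.

```python
import tarfile

MAGIC_OFFSET = 257                # start of the magic field in a 512-byte header
OLD_GNU_MAGIC = b"ustar  \x00"    # "ustar[sp][sp][null]", spans magic + version
PAX_MAGIC = b"ustar\x00"          # 6-octet magic, null terminated
PAX_VERSION = b"00"               # 2-octet version, not null terminated


def classify_header(block: bytes) -> str:
    """Return 'old-gnu', 'pax', or 'unknown' for one 512-byte tar header.

    Hypothetical helper; the labels are illustrative only.
    """
    if len(block) < 512:
        raise ValueError("a tar header block is 512 bytes")
    # Magic (6 octets) and version (2 octets) are adjacent, 8 octets in total.
    field = block[MAGIC_OFFSET:MAGIC_OFFSET + 8]
    if field == OLD_GNU_MAGIC:
        return "old-gnu"
    if field[:6] == PAX_MAGIC and field[6:] == PAX_VERSION:
        return "pax"
    return "unknown"


if __name__ == "__main__":
    # Sample headers generated by tarfile, one per format.
    info = tarfile.TarInfo("example.dat")
    for fmt in (tarfile.GNU_FORMAT, tarfile.PAX_FORMAT):
        header = info.tobuf(format=fmt)[:512]
        print(classify_header(header))
```

Note that the magic alone cannot tell ustar from pax, since both use the same
magic and version values; the sketch only separates the old GNU-style header
from the POSIX one, which is the distinction sam-stagerd needs.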
> > If an old version of SAM-QFS attempts to read a new pax format tar
> > header it will declare the file 'damaged', which is an existing
> > SAM-QFS concept. No legacy patch is planned, and this incompatibility
> > will be documented for customers.
> >
> > The SAM-QFS 'sls -D' command and option, a SAM-QFS-aware version of
> > ls(1), will be modified to report the tape format used for the given
> > file, if that file is currently on tape. It will make this decision
> > based on an available bit in the on-disk inode, which is currently
> > zeroed on existing SAM-QFS filesystems.
> >
> > A new versioned library, libpax_xhdr.so, will be added to SAM-QFS.
> > This library is created by abstracting the tar-specific bits out of
> > 'stager' and 'arcopy'.
>
> Rather than factor tar, pax, stager, and arcopy internals into a new
> library, why not just invoke the pax utility to read and write pax
> archives?
The sam-arcopy and sam-stagerd daemons are able to use filesystem-specific I/O
operations to read and write the files to/from the filesystem. This means
files "invisibly" move between the disk and tape, which is a necessary feature
of HSM products such as SAM-QFS to make the disk appear "bottomless".
The sam-arcopy and sam-stagerd daemons are highly threaded to optimize their
particular activity profile, and they use direct I/O. That activity profile
also makes it expensive to fork+exec a tar command for every file request,
because the daemons jump around a tape rather than reading/writing it from
start to finish.
In summary: the current design is that these daemons have the bare-bones bits
necessary to read and write the archives. The project team doesn't want to
redesign these daemons; they just want to add the new header format.
Dean