The reproducible.d.n Jenkins jobs revealed a file ordering issue when we
construct kfreebsd-source-10.1.tar.xz from src:kfreebsd-10 this way:

        tar cfJ $(SOURCE_PACKAGE)/usr/src/$(SRC_TAR) $(SRC_DIR)

What would be a preferred way to build a tar archive with deterministic
file order?

Firstly let me point out that:

$ find foo/ -type f -print0 | sort -z | xargs -0 tar -cvf foo.tar

is risky as the list of filenames could overflow what xargs can supply
to a single invocation of tar.  In that case it would invoke tar again
and clobber the previous output.

Using the xargs -n1 option we can simulate that happening:

$ mkdir foo/
$ touch foo/{a,b,c}
$ find foo/ -type f -print0 | sort -z | xargs -n1 -0 tar -cvf foo.tar
$ tar -tvf foo.tar
-rw-r--r-- steven/steven     0 2015-02-14 20:05 foo/c

I expected the following to work, and it does concatenate the output of
each tar invocation into a single file.  But at least with GNU tar, every
invocation ends its output with an end-of-archive marker, so archives
can't simply be concatenated this way.  (It also might not have been
reproducible, if the point at which xargs splits the file list can vary
between systems.)  Listing the archive's contents only returns the first
entry:

$ find foo/ -type f -print0 | xargs -n1 -0 tar -cvf - > foo.tar
$ tar -tvf foo.tar
-rw-r--r-- steven/steven     0 2015-02-14 20:05 foo/a
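
For what it's worth, the other members are still present in the
concatenated file; the listing just stops at the first end-of-archive
marker.  A minimal sketch to see this, assuming GNU tar and its
--ignore-zeros (-i) option, run in a throwaway temporary directory (the
variable names are only for illustration):

```shell
# Work in a scratch directory, mirroring the foo/ layout used above.
cd "$(mktemp -d)"
mkdir foo
touch foo/a foo/b foo/c

# One tar invocation per file, all writing to the same stdout.
find foo/ -type f -print0 | sort -z | xargs -n1 -0 tar -cf - > foo.tar

# A plain listing stops at the first end-of-archive marker.
plain=$(tar -tf foo.tar)

# With --ignore-zeros (-i), GNU tar skips the zero blocks and keeps reading.
all=$(tar -tif foo.tar)

printf 'plain listing:\n%s\n' "$plain"
printf 'with -i:\n%s\n' "$all"
```

That only helps when reading such a file; it does not make the result a
clean single archive, so I did not pursue it.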

So I came to this:

$ rm -f foo.tar && find foo/ -type f -print0 | xargs -n1 -0 tar -rvf foo.tar

using the --append (-r) option of GNU tar.  It works, but that's not
really ideal, as I'd prefer to pipe the output through xz before writing
anything to disk.  The -r option can't be used with stdout or -J.

Finally I ended up with this:

$ find foo/ -type f -print0 | sort -z > filelist
$ tar --null -T filelist -Jcvf foo.tar.xz
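
A possible variant avoids the temporary file by letting tar read the list
from stdin.  This is only a sketch, assuming GNU tar (for -T - and --null)
and a sort that supports -z; LC_ALL=C is my own addition so that the order
does not depend on the locale, and the scratch directory is hypothetical:

```shell
# Work in a scratch directory with files created in non-sorted order.
cd "$(mktemp -d)"
mkdir foo
touch foo/c foo/a foo/b

# LC_ALL=C sorts bytewise, independent of the user's locale.
# --null is given before -T so the list is read as NUL-terminated names.
find foo/ -type f -print0 | LC_ALL=C sort -z | tar --null -T - -cf foo.tar

# List the members to confirm they were stored in sorted order.
members=$(tar -tf foo.tar)
printf '%s\n' "$members"
```

Replacing -cf foo.tar with -cJf foo.tar.xz should give the xz-compressed
archive directly, without writing an uncompressed tar to disk first.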

Does that seem like the neatest way, or do you have better suggestions?

(I thought this problem would be quite common, so I could add it to the
Wiki FAQ).

Steven Chamberlain

Reproducible-builds mailing list
