On Mon, Apr 1, 2024, at 2:04 PM, Russ Allbery wrote:
> "Zack Weinberg" <z...@owlfolio.org> writes:
>> It might indeed be worth thinking about ways to minimize the
>> difference between the tarball "make dist" produces and the tarball
>> "git archive" produces, starting from the same clean git checkout,
>> and also ways to identify and audit those differences.
>
> There is extensive ongoing discussion of this on debian-devel. There's
> no real consensus in that discussion, but I think one useful principle
> that's emerged that doesn't disrupt the world *too* much is that the
> release tarball should differ from the Git tag only in the form of
> added files. Any files that are present in both Git and in the release
> tarball should be byte-for-byte identical.

That dovetails nicely with something I was thinking about myself.
Obviously the result of "make dist" should be reproducible except for
signatures; to the extent it isn't already, those are bugs in automake.
But also, what if "make dist" produced *two* disjoint tarballs? One
would be guaranteed to be byte-for-byte identical to an archive of the
VCS at the release tag (in some clearly documented fashion; AIUI, "git
archive" does *not* do what we want).  The other would contain all the
files that "autoreconf -i" or "./bootstrap.sh" or whatever would create,
but nothing else.  Diffs could be provided for both tarballs, or only for
the VCS-archive tarball, whichever turns out to be more compact (I can
imagine the diff for the generated-files tarball turning out to be
comparable in size to the generated-files tarball itself).
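
To make the "byte-for-byte identical" property concrete, here is a rough
sketch of how one might audit a release tarball against a clean checkout
of the release tag. It is not anything automake does today. The function
names, the SHA-256 comparison, and the assumption that the tarball wraps
everything in one top-level directory (e.g. "pkg-1.0/") are all
illustrative choices.

```python
import hashlib
import tarfile
from pathlib import Path, PurePosixPath


def tarball_digests(tar_path, strip_components=1):
    """Map each regular file in the tarball to its SHA-256 digest.

    Assumption: the tarball has `strip_components` leading directories
    (such as "pkg-1.0/") that do not exist in the checkout.
    """
    digests = {}
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, symlinks, etc.
            parts = PurePosixPath(member.name).parts[strip_components:]
            if not parts:
                continue
            data = tar.extractfile(member).read()
            digests["/".join(parts)] = hashlib.sha256(data).hexdigest()
    return digests


def compare_to_checkout(tar_path, checkout_dir, strip_components=1):
    """Return (added, modified) lists of tarball member names.

    added:    files in the tarball that are absent from the checkout
    modified: files present in both whose bytes differ
    """
    added, modified = [], []
    root = Path(checkout_dir)
    digests = tarball_digests(tar_path, strip_components)
    for name, digest in sorted(digests.items()):
        on_disk = root / name
        if not on_disk.is_file():
            added.append(name)
        elif hashlib.sha256(on_disk.read_bytes()).hexdigest() != digest:
            modified.append(name)
    return added, modified
```

Under the "added files only" principle, "modified" should come back
empty, and "added" is the list of files that need auditing. Files that
exist only in the checkout (a .gitignore, say) are deliberately not
reported by this sketch.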

This should make it much easier to find, and therefore audit, the
pre-generated files, and to validate that there's no overlap. It would
add an extra step for people who want to build from a tarball without
having to install autoconf (or whatever) first -- but an easier extra
step than, y'know, installing autoconf. :)  Conversely, people who want to
build from tarballs but *not* use the pre-generated configure, etc,
could now download the 'bare' tarball only.
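
The "no overlap" check is mechanical once there are two tarballs. A
minimal sketch, assuming both tarballs use the same single top-level
directory; the function names and the strip_components parameter are
hypothetical, not part of any existing tool:

```python
import tarfile
from pathlib import PurePosixPath


def member_names(tar_path, strip_components=1):
    """Return the set of non-directory member paths in a tarball.

    Assumption: `strip_components` leading directories (e.g. "pkg-1.0/")
    are dropped so that names from the two tarballs are comparable.
    """
    names = set()
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if member.isdir():
                continue  # shared directories are not an overlap
            parts = PurePosixPath(member.name).parts[strip_components:]
            if parts:
                names.add("/".join(parts))
    return names


def overlapping_files(vcs_tar, generated_tar, strip_components=1):
    """Return the sorted list of paths that appear in both tarballs."""
    vcs = member_names(vcs_tar, strip_components)
    generated = member_names(generated_tar, strip_components)
    return sorted(vcs & generated)
```

An empty result means the two tarballs are disjoint, so unpacking the
generated-files tarball on top of the 'bare' one cannot overwrite
anything that came from the VCS.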

("Couldn't those people just build from a git checkout?"  Not if they
don't have the tooling for it, not during early stages of a distribution
bootstrap, etc.  Also, the act of publishing a tarball that's a golden
copy of the VCS at the release tag is valuable for archival purposes.)

zw
