Hi!
We in Debian are working on giving our users a way to reproduce
bit-for-bit identical binary packages from the source and build
environment, and I know other free software projects care about this
too. See <https://wiki.debian.org/ReproducibleBuilds/About> for some
rationale and further explanations.
In order to do this, we need to make our build processes as
deterministic as possible. As you can imagine, Tar is quite involved in
producing Debian packages. A straightforward call leads to multiple
issues:
* Order of files in the archive will depend on the filesystem order.
* User and group names are recorded. This can be seen as a privacy leak
for the package builder.
* Permissions are dependent on the builder umask.
* Last modification times of archive members for files created during
the build depend on the build time.
* Also, if gzip compression is used, a timestamp is recorded in the
gzip header.
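To make the last point concrete, here is a small sketch that dumps the 4-byte MTIME field stored at offset 4 of the gzip header, with and without `-n`. It assumes GNU gzip, touch and od; the file name `demo.txt` and the helper `gzip_mtime_bytes` are made up for the example.

```shell
#!/bin/sh
# Sketch: show the MTIME field (bytes 4-7, little-endian) of a gzip header.
# Assumes GNU gzip, touch and od. demo.txt and gzip_mtime_bytes are
# hypothetical names used only for this illustration.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

printf 'hello\n' > "$workdir/demo.txt"
# Give the input file a known mtime: 1000000000 is 0x3B9ACA00.
touch -d @1000000000 "$workdir/demo.txt"

# Print the four MTIME bytes of a gzip stream read from stdin, as hex.
gzip_mtime_bytes() {
    od -An -tx1 -j4 -N4 | tr -d ' \n'
}

# Default: gzip copies the input file's mtime into the header.
with_stamp=$(gzip -c "$workdir/demo.txt" | gzip_mtime_bytes)
# With -n: the name and timestamp are left out, so the field is zero.
without_stamp=$(gzip -nc "$workdir/demo.txt" | gzip_mtime_bytes)

echo "default: $with_stamp"
echo "with -n: $without_stamp"
```

The same field is what makes two otherwise identical `.tar.gz` files differ when they are compressed at different times.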
So, we are currently turning calls like:
    tar -zcf archive.tar.gz src
into:
    find src -print0 | LC_ALL=C sort -z |
        GZIP=-9n tar --null -T - --no-recursion \
            --owner=root --group=root --numeric-owner \
            --mode=go=rX,u+rw,a-s \
            --mtime=./debian/changelog \
            -zcf archive.tar.gz
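For anyone who wants to try this out, here is a rough sketch that wraps the pipeline above in a function and checks that two builds with different umasks and file mtimes give byte-identical archives. It assumes GNU tar, find, sort, gzip and sha256sum. The function name `det_tar` and all paths are made up for the example. Two things differ from the command above and are my own choices: it pipes through `gzip -9n` explicitly instead of setting `GZIP`, since recent gzip releases deprecate that variable, and it pins `--format=gnu` so that a different default archive format cannot change the result.

```shell
#!/bin/sh
# Sketch: deterministic tar.gz wrapper plus a self-check.
# Assumes GNU tar, find, sort, gzip and sha256sum.
# det_tar and every path below are hypothetical names for illustration.
set -eu

# det_tar SRC OUT MTIME
# Archive SRC into OUT (gzip-compressed), using MTIME (anything GNU tar's
# --mtime accepts, e.g. @SECONDS or a file name starting with ./ or /)
# as the modification time of every member.
det_tar() {
    find "$1" -print0 | LC_ALL=C sort -z |
        tar --null -T - --no-recursion \
            --format=gnu \
            --owner=root --group=root --numeric-owner \
            --mode=go=rX,u+rw,a-s \
            --mtime="$3" \
            -cf - |
        gzip -9n > "$2"
}

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# First build: restrictive umask.
(umask 077; mkdir -p src/sub; echo one > src/a; echo two > src/sub/b)
det_tar src first.tar.gz @1000000000

# Second build: fresh tree, different umask, different creation order,
# and one file with a clearly different mtime.
rm -rf src
(umask 022; mkdir -p src/sub; echo two > src/sub/b; echo one > src/a)
touch -d @1500000000 src/a
det_tar src second.tar.gz @1000000000

sum1=$(sha256sum < first.tar.gz)
sum2=$(sha256sum < second.tar.gz)
echo "first:  $sum1"
echo "second: $sum2"
```

Note that plain `sh` has no `pipefail`, so a failure in `find` or `tar` would go unnoticed here; a real build script would want to check for that.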
It would be great to avoid at least some of the boilerplate. Finding a
generic solution for permissions and modification times might be too
much, but having a `--deterministic` flag for the rest of the issues
would be quite helpful already.
What do you think?
--
Lunar .''`.
[email protected] : :Ⓐ : # apt-get install anarchism
`. `'`
`-
