Re: [gnu-prog-discuss] Automake dist reproducibility

2015-12-22 Thread Pádraig Brady
On 22/12/15 17:00, Mike Gerwitz wrote:
> There is ongoing discussion about reproducible builds within GNU.  I'm
> having trouble figuring out the best approach for deterministic
> distribution archives using Automake.

I've not thought much about this, but I'm
wondering about how useful deterministic tarballs are?

The main thrust of reproducible builds is to verify what's
running on the system, and there are so many variables
between the tarball and build, that I'm not sure it's
worth worrying about non-determinism in the intermediate steps?

Perhaps the main focus for tarballs should just be to
ensure they're properly signed.

cheers,
Pádraig.

p.s. It would be good to give more control to upstream devs
to config archiving options in Makefile.am etc.



Re: [gnu-prog-discuss] Automake dist reproducibility

2015-12-22 Thread Mike Gerwitz
There is ongoing discussion about reproducible builds within GNU.  I'm
having trouble figuring out the best approach for deterministic
distribution archives using Automake.

Here's my original message on gnu-prog-discuss:

> I did read https://reproducible-builds.org/docs/archives/.
>
> Automake-generated Makefiles have many archive options.  I'm assuming
> that my best option is to modify the timestamps and other metadata of
> the files in distdir using `dist-hook`, but that doesn't solve file
> ordering.
>
> What would the GNU recommendation be in this case, and what fits best
> with the spirit of Automake?  Post-processing the tarball is awkward
> since it is part of a pipeline (to whatever compression algorithm is
> chosen for the final archive).  I'm not sure how to modify am__tar to
> include processing as part of that pipeline (e.g. as used in
> dist-gzip)---Automake doesn't provide options to configure its value
> outside of _AM_PROG_TAR, which is rigid.
>
> strip-nondeterminism appears to support ar, gzip, jar, and zip; should I
> just use that?


Ludo had some suggestions:

On Tue, Dec 22, 2015 at 17:23:55 +0100, Ludovic Courtès wrote:
> At the very least, Automake should change the default value of
> ‘GZIP_ENV’ to “--best --no-name” (the latter tells gzip to not add a
> timestamp in its output.)
>
> Ideally ‘make dist’ would also sort files in the archives.  Recent
> versions of GNU tar support ‘--sort=name’ but we’d need a way to do that
> portably (or require GNU tar for ‘make dist’.)
>
> Lastly, archive timestamps could be reset, as per --mtime=@0, but again,
> portability needs to be considered.  In some cases, this feature might
> need to be turned off.
>
> Thoughts?


Is there a [good] way to solve this problem until we can implement any
suggestions in Automake?
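
For illustration, the three ideas Ludo lists (no name or timestamp in the
gzip header, sorted members, reset mtimes) can be tried outside Automake by
repacking the directory that `make distdir` leaves behind.  The sketch below
is Python, not anything Automake provides; the function name
`deterministic_tar_gz`, the ustar format, and the decision to zero the owner
fields are all assumptions made for the example.  File modes are taken from
disk unchanged, so the umask still has to match between runs.

```python
import gzip
import os
import tarfile


def deterministic_tar_gz(src_dir, out_path, mtime=0):
    """Pack src_dir into out_path with sorted members and fixed metadata.

    Hypothetical helper for illustration; not part of Automake.
    """
    top = os.path.abspath(src_dir)
    parent = os.path.dirname(top)

    # Collect every path below the top directory, then sort by name
    # (roughly what GNU tar's --sort=name is meant to achieve).
    paths = []
    for root, dirs, files in os.walk(top):
        for name in dirs + files:
            paths.append(os.path.join(root, name))
    paths.sort()

    with open(out_path, "wb") as raw:
        # filename="" and a fixed mtime keep the original name and the
        # timestamp out of the gzip header, similar to gzip --no-name.
        with gzip.GzipFile(filename="", mode="wb", fileobj=raw,
                           mtime=mtime, compresslevel=9) as gz:
            with tarfile.open(fileobj=gz, mode="w",
                              format=tarfile.USTAR_FORMAT) as tar:
                for path in [top] + paths:
                    arcname = os.path.relpath(path, parent)
                    info = tar.gettarinfo(path, arcname=arcname)
                    # Assumption: owner and timestamp are not meaningful
                    # for a source release, so normalise them.
                    info.mtime = mtime
                    info.uid = info.gid = 0
                    info.uname = info.gname = ""
                    if info.isreg():
                        with open(path, "rb") as f:
                            tar.addfile(info, f)
                    else:
                        tar.addfile(info)
```

Calling `deterministic_tar_gz("foo-1.0", "foo-1.0.tar.gz")` on two checkouts
with identical file contents and modes should then give byte-identical
archives, at least with the same Python and zlib versions.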

-- 
Mike Gerwitz
Free Software Hacker | GNU Maintainer
https://mikegerwitz.com
FSF Member #5804 | GPG Key ID: 0x8EE30EAB




Re: [gnu-prog-discuss] Automake dist reproducibility

2015-12-22 Thread Warren Young
On Dec 22, 2015, at 2:51 PM, Bob Friesenhahn wrote:
> 
> Attempting to get archiving tools to produce the same results at different 
> times on different machines is close to impossible

Fortunately, others have already done much of the hard work:

  https://reproducible-builds.org/




Re: [gnu-prog-discuss] Automake dist reproducibility

2015-12-22 Thread Ludovic Courtès
Pádraig Brady wrote:

> On 22/12/15 17:00, Mike Gerwitz wrote:
>> There is ongoing discussion about reproducible builds within GNU.  I'm
>> having trouble figuring out the best approach for deterministic
>> distribution archives using Automake.
>
> I've not thought much about this, but I'm
> wondering about how useful deterministic tarballs are?
>
> The main thrust of reproducible builds is to verify what's
> running on the system, and there are so many variables
> between the tarball and build, that I'm not sure it's
> worth worrying about non-determinism in the intermediate steps?
>
> Perhaps the main focus for tarballs should just be to
> ensure they're properly signed.

You’re right that deterministic tarballs are not the immediate concern
of reproducible builds; usually, we focus on binaries.

However, if running ‘make dist’ at a given commit of a project leads to
exactly one tarball, then people can verify the tarball against the VCS
commit.  This is especially interesting when people sign commits/tags.
We could authenticate code with much finer grain.

This also reduces incentives to attack the person that runs ‘make dist’
and signs the result since anyone could independently check the tarball.

Basically same motivation as with reproducible builds, but one level
higher.

Ludo’.



Re: [gnu-prog-discuss] Automake dist reproducibility

2015-12-22 Thread Bob Friesenhahn

On Tue, 22 Dec 2015, Pádraig Brady wrote:

> On 22/12/15 17:00, Mike Gerwitz wrote:
>> There is ongoing discussion about reproducible builds within GNU.  I'm
>> having trouble figuring out the best approach for deterministic
>> distribution archives using Automake.
>
> I've not thought much about this, but I'm
> wondering about how useful deterministic tarballs are?
>
> The main thrust of reproducible builds is to verify what's
> running on the system, and there are so many variables
> between the tarball and build, that I'm not sure it's
> worth worrying about non-determinism in the intermediate steps?
>
> Perhaps the main focus for tarballs should just be to
> ensure they're properly signed.


I would agree that it is the extracted binary contents of the tarballs
(ignoring artifacts like file timestamps and user ids) which count.
Attempting to get archiving tools to produce the same results at
different times on different machines is close to impossible.
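
The weaker check described here, comparing what the tarballs extract to
while ignoring timestamps and user ids, can be sketched without unpacking
anything to disk.  The following Python is only an illustration; the helper
names `content_digests` and `same_contents` are made up for the example, and
it also ignores permission bits, which is an assumption some projects may
not want.

```python
import hashlib
import tarfile


def content_digests(tar_path):
    """Map each member name to (kind, digest or link target).

    Timestamps, owners and modes are deliberately left out.
    """
    digests = {}
    # "r:*" lets tarfile detect gzip/bzip2/xz compression by itself.
    with tarfile.open(tar_path, "r:*") as tar:
        for member in tar:
            if member.isreg():
                data = tar.extractfile(member).read()
                digests[member.name] = ("file", hashlib.sha256(data).hexdigest())
            elif member.issym() or member.islnk():
                digests[member.name] = ("link", member.linkname)
            elif member.isdir():
                digests[member.name] = ("dir", "")
            else:
                digests[member.name] = ("other", "")
    return digests


def same_contents(tar_a, tar_b):
    """True if both archives hold the same names with the same data."""
    return content_digests(tar_a) == content_digests(tar_b)
```

Because the result is a dictionary keyed by member name, the order of files
inside each archive does not affect the comparison either.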


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/

Re: [gnu-prog-discuss] Automake dist reproducibility

2015-12-22 Thread Warren Young
On Dec 22, 2015, at 12:16 PM, Pádraig Brady wrote:
> 
> On 22/12/15 17:00, Mike Gerwitz wrote:
>> There is ongoing discussion about reproducible builds within GNU.
> 
> I’m wondering about how useful deterministic tarballs are?

This page gives the “whys” of reproducible builds:

  https://wiki.debian.org/ReproducibleBuilds/About

> Perhaps the main focus for tarballs should just be to
> ensure they're properly signed.

Signing only proves that the package provider possesses the private key, which 
implies — but does not prove — that the signer is the party you expect the 
packages to come from.

The security risk is that if someone can steal the private key, they can sign 
arbitrary packages.

But if you can independently create the same pre-signature tarball from the
source package, you can prove conclusively that the source code is the same
as that used to create that binary package.

This does not prove that the source code hasn’t also been compromised, but once 
you’ve reduced the verification problem to the source level, you can use 
traditional high-level means of verification: diffing against previous source 
releases, diffing against the project’s public source repo, auditing the 
source, etc.
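
Assuming the archive really is reproducible, the independent check comes
down to rebuilding it and comparing digests with the published file.  A
minimal Python sketch follows; the function names `sha256_of` and
`tarballs_match` and the choice of SHA-256 are assumptions for the example,
not something the thread prescribes.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tarballs_match(published, rebuilt):
    """True if the published and locally rebuilt archives are byte-identical."""
    return sha256_of(published) == sha256_of(rebuilt)
```

A mismatch does not say which side is wrong, only that the rebuilt archive
and the published one differ, at which point the source-level comparisons
mentioned above take over.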