Bug#876055: Environment variable handling for reproducible builds

Vagrant Cascadian Mon, 18 Sep 2017 18:09:47 -0700

On 2017-09-18, Russ Allbery wrote:
> Daniel Kahn Gillmor <d...@fifthhorseman.net> writes:
>> On Sun 2017-09-17 16:26:25 -0700, Russ Allbery wrote:
>
>>> I personally lean towards 2, which is consistent with what's in Policy
>>> right now, but I can see definite merits in 3.  I believe the
>>> reproducible builds project is currently sort of doing 1, but I have a
>>> hard time seeing how to make that viable on the testing side.
>
>> Thanks for raising this question, Russ!


Indeed!


>> I'm not sure that we should let lack of exhaustive testing push us away
>> from (1).  (1) is in principle the right thing -- it's easy to make a
>> build reproducible if we tell people that they have to do exactly one
>> specific thing.  But we generally want people to be able to run
>> heterogeneous systems, and not to force them into one particular
>> environment.
>
> Well... I would argue that the amount of time and effort that's gone into
> this project shows that it's not that easy to make a build reproducible
> even when telling people to do exactly one thing.  :)  But I get your
> point.

Much of the work has already been done by aspirational, principled
folks... :)


>> Does everything in policy need to be rigorously testable?  Or is it ok
>> to have Policy state the desired outcome even if we don't know how (or
>> don't have the resources) to test it fully today?
>
> I don't think everything has to be rigorously testable, but I do think
> it's a useful canary.  If I can't test something, I start wondering
> whether that means I have problems with my underlying assumptions.
>
> In particular, for (1), we have no comprehensive list of environment
> variables that affect the behavior of tools, and that list would be
> difficult to create.  Many pieces of software add their own environment
> variables with little coordination, and many of those variables could
> possibly affect tool output.

There is a huge difference between variables that *might* affect the
build as an unintended input that gets stored in the resulting packages
in some manner, and variables that are designed to change the behavior
of parts of the build toolchain.

I consider it a bug when an unintended variable affects the build
output, whereas variables designed and intended to change the behavior
of the toolchain are expected, reasonable behavior.


> I feel like the work for (1) and for (3) ends up being comparable; for (1)
> we have to maintain a blacklist, and for (3) we have to maintain a
> whitelist.  But (3) is testable, whereas (1) is inherently aspirational
> and will always have to be aspirational.  We're endlessly going to be
> discovering some other environment variable that changes tool output.

Well, there can be a testable, automatable standard, and a higher,
aspirational standard in parallel.

Which largely seems consistent with what's already in policy... but I'm
not sure it's appropriate to codify these whitelists or blacklists in
policy.
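
To make the whitelist idea concrete, here is a minimal sketch of scrubbing
the environment before a build, assuming a build harness written in
Python.  The function name and the contents of ALLOWED_VARIABLES are
hypothetical, chosen only for illustration; they are not taken from
Policy or from any existing tool.

```python
# Hypothetical whitelist (allowlist); the real set would need to be
# agreed on elsewhere and is not defined in this thread.
ALLOWED_VARIABLES = frozenset(
    {"PATH", "HOME", "LANG", "LC_ALL", "TZ", "SOURCE_DATE_EPOCH"}
)


def scrub_environment(env, allowed=ALLOWED_VARIABLES):
    """Return a new dict keeping only the whitelisted variable names.

    The input mapping is left unmodified.
    """
    return {name: value for name, value in env.items() if name in allowed}
```

The result could be passed as the env= argument of subprocess.run() so
that the build only ever sees the whitelisted variables, which is what
makes this variant testable.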


> I'm also unsure that (1) is even what we want to claim.  Do we really want
> to say that builds are always reproducible if you don't change this short
> list of environment variables, no matter whatever other environment
> variables you set?

I don't think we want to make absolute claims; reproducible builds is
about having greater confidence that the binaries are produced from the
source, not absolute confidence.

The ideal is to have as many builds as possible corroborated from a
diverse group of build machines, developers, third-parties,
sophisticated end-users, legal jurisdictions, etc.


> There's some appeal in this for the end user, but it
> feels very frustrating for the package maintainer.  At first glance, as a
> package maintainer, I'd think I'd have to maintain a huge blacklist of
> environment variables that I've discovered affect my toolchain somewhere,
> and explicitly unset them all in debian/rules.  This doesn't feel like a
> good use of anyone's time (and may actually *break* other,
> non-reproducibility-related things that people want to do with my
> package).

In practice, for the vast majority of packages in Debian, handling a
relatively small number of environment variables is enough to get fairly
solid reproducibility coverage... at least from what we've seen so far.

The hard part is actually continuing to tease them out...
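
One rough way to tease them out is to vary a single variable at a time
and compare a hash of the output against a baseline.  The sketch below
assumes the build can be wrapped in a callable that takes an environment
dict and returns the resulting bytes; find_leaking_variables and its
parameters are hypothetical names, not part of any existing
reproducible builds tooling.

```python
import hashlib


def find_leaking_variables(build, base_env, candidates):
    """Return the candidate names whose test value changes the output.

    build: callable taking an environment dict and returning bytes.
    base_env: the environment used for the baseline build.
    candidates: mapping of variable name to the value to try.
    """
    baseline = hashlib.sha256(build(dict(base_env))).hexdigest()
    leaking = []
    for name, value in sorted(candidates.items()):
        varied_env = dict(base_env)
        varied_env[name] = value  # vary exactly one variable per build
        if hashlib.sha256(build(varied_env)).hexdigest() != baseline:
            leaking.append(name)
    return leaking
```

This only catches a variable if the chosen test value actually differs
in effect from the baseline, and it will not notice variables that only
matter in combination, so it gives some confidence rather than proof.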


live well,
  vagrant
