Russ Allbery:
> Ximin Luo <infini...@debian.org> writes:
> 
>> Fair enough. I actually spotted that but thought it was better to get
>> "something" into Policy rather than nitpick. I guess other people were
>> thinking similar things. Well, lesson learnt, I will be more forceful
>> next time.
> 
>> The sentence I amended said "most environment variables" so our intent
>> is clear. If we want to fix this now, I would suggest amending:
> 
>> - a set of environment variable values; and
>> + a set of reserved environment variable values; and
> 
>> then later:
> 
>> + A "reserved" environment variable is defined as DEB_*, DPKG_, 
>> SOURCE_DATE_EPOCH, BUILD_PATH_PREFIX_MAP, variables listed by 
>> dpkg-buildflags and other variables explicitly used by buildsystems to 
>> affect build output, excluding any variables used by non-build programs to 
>> affect their behaviour. Explicitly, this excludes TERM, HOME, LOGNAME, USER, 
>> PATH and likely any variables ending with *PATH.
> 
> We intentionally didn't spell this out in this much detail because it felt
> better to defer this (stricter) bar until we have documentation of the
> *.buildinfo file, and also because we were worried about the list changing
> (once it goes into Policy, it's more irritating to change).  The current
> standard in Policy is intentionally weaker than this in order to be
> simpler.
> 
> I still lean towards taking this approach, because I'm pretty worried
> about the scope of:
> 
>     other variables explicitly used by buildsystems to affect build output
> 
> That's not really an enumerable list.  My recommendation, if you want to
> allow some environment variables to vary without affecting
> reproducibility, is to explicitly list the set of environment variables
> that can vary, rather than trying to list the ones that have to remain
> fixed.
> 

Intuitively it feels weird to say "if you vary USER, the output must remain 
fixed", but also "if you vary RANDOMUNIQUESPECIALSNOWFLAKEVARIABLE then the 
output is allowed to change".

Certain environment variables, like CFLAGS, are used by convention to affect a 
build, and even debuild(1) doesn't clear them, although it clears the other 
envvars. That is what I was going on.
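
To make that distinction concrete, here is a minimal Python sketch of an
environment filter that keeps build-affecting variables and drops everything
else. The KEEP_PATTERNS list and the sanitize_env name are made up for
illustration; this is not debuild's actual list or implementation.

```python
import fnmatch

# Hypothetical allow-list, for illustration only (NOT debuild's real list).
KEEP_PATTERNS = ["DEB_*", "CFLAGS", "CPPFLAGS", "CXXFLAGS", "LDFLAGS"]

def sanitize_env(env, keep_patterns=KEEP_PATTERNS):
    """Return a copy of env with only the variables matching keep_patterns."""
    return {
        name: value
        for name, value in env.items()
        if any(fnmatch.fnmatchcase(name, pat) for pat in keep_patterns)
    }
```

With such a filter, CFLAGS and DEB_BUILD_OPTIONS would reach the build, while
USER or RANDOMUNIQUESPECIALSNOWFLAKEVARIABLE would not.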

> But, more fundamentally, I'm dubious that weakening the environment
> variable set is a good use of anyone's time.  Why not define reproducible
> builds as setting a specific set of environment variables and no others?
> We're long past the point where building packages in an isolated
> environment with a fixed set of environment variables is a great hardship
> or even particularly unusual.  I think the effort would be better spent on
> fixing (with enumerated exceptions) the set of environment variables set
> by buildds, sbuild, pbuilder, and other infrastructure that builds
> packages than in making packages tolerate random environment variables
> being set during the build.  It's really hard to track down all the
> environment variable settings that might affect Autoconf, the build tools,
> document formatters, and so forth.
> 

My proposal was the opposite: to *strengthen* the definition that was already 
accepted. I *don't* think we should track down all those variables and make 
packages immune to them; that is why I added "other variables explicitly used 
by buildsystems to affect build output" etc. OTOH, some other variables are 
used by non-build tools, such as LC_*, USER, etc. Since they affect non-build 
programs, they may well be set in a developer's normal environment, so just 
running "debian/rules build" will pick these up. The build output should then 
stay the same regardless of these other variables.

If a build tool needs to be run in a specific locale, it should either use a 
locale-independent sorting program, or set LC_ALL explicitly itself regardless 
of what the parent environment says.
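
Both options can be sketched in a few lines of Python. The function names
below are hypothetical and this is only an illustration of the idea, not code
taken from any existing build tool.

```python
import os
import subprocess

def run_in_c_locale(argv, input_text):
    """Run argv with LC_ALL=C, overriding whatever the parent environment set."""
    env = dict(os.environ)
    env["LC_ALL"] = "C"  # set explicitly, regardless of the caller's locale
    result = subprocess.run(
        argv, input=input_text, env=env,
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def sort_lines_bytewise(lines):
    """Locale-independent sort: order strings by their UTF-8 byte values."""
    return sorted(lines, key=lambda s: s.encode("utf-8"))
```

The first function covers tools that must shell out to a locale-sensitive
program; the second avoids the locale entirely by never consulting it.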

This doesn't prevent us from using a fixed or mostly-clean environment in 
sbuild, pbuilder, debuild, etc.

Now that I think about it however, it's probably not reasonable to expect that 
the output remains the same when PATH is changed. On tests.r-b.org we vary it 
by appending a dummy value [1] but if the user adds their own stuff to the 
beginning then the output may well change. There is probably no point in trying 
to prevent that in all packages. In a sense, it does very much affect what 
build tools are run, even though non-build programs also use it. However, my 
gut feeling still says that it's not right for the locale (LC_*) to affect a 
build process. I will try to think of a more precise way to express this 
difference.
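
As a rough illustration of what "vary one variable and compare" means, here is
a Python sketch. It hashes a command's stdout as a stand-in for real build
artifacts, and the helper names are hypothetical; the actual variation testing
on tests.r-b.org is more involved than this.

```python
import hashlib
import subprocess

def output_digest(argv, env):
    """Run argv under env and return the SHA-256 hex digest of its stdout."""
    result = subprocess.run(argv, env=env, capture_output=True, check=True)
    return hashlib.sha256(result.stdout).hexdigest()

def varies_with(argv, base_env, name, value):
    """Return True if setting name=value changes the command's output."""
    varied = dict(base_env)
    varied[name] = value
    return output_digest(argv, base_env) != output_digest(argv, varied)
```

For example, varies_with(cmd, env, "LC_ALL", "fr_FR.UTF-8") should be False for
a build that is independent of the locale, whereas prepending a directory to
PATH may legitimately change which tools get run.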

X

[1] https://tests.reproducible-builds.org/debian/index_variations.html

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git
