Karen Etheridge <p...@froods.org> writes:

>  I believe the paragraph in the docs should stay, but change the MUST to
> a SHOULD, with a proviso that there should be a way to disable it (for
> the purposes of repeatable builds etc). If the paragraph is removed
> entirely, no one will implement it (the fact that it is not
> well-implemented now is sad, but beside the point). I have no strong
> feelings as to whether the option should default to on or off, but the
> option should exist for those that wish this extra content.

The tricky thing about reproducible builds is that the folks who are
attempting to do them are building large numbers of packages at once, and
are generally trying to do it without changing the package build system
(changing it would sort of defeat the point).  So having an option to disable
non-reproducible behavior can be a bit tricky: it has to be accessible to
their build infrastructure (which usually means an environment variable),
and it also means that the default build is not reproducible, thus partly
undermining the security benefits.

Sometimes the benefits of whatever is making the build non-reproducible
are significant enough that this is all worth it.  I think that's the case
for the date embedded in manual pages, for example, since that provides
users with a valuable hint about when the documentation was last modified,
and there's no other good way to get that date, if not explicitly
provided, other than using the file modification timestamp (something
that's not always reproducible).  For that case, I kept that as the default
behavior, but added support for the SOURCE_DATE_EPOCH environment variable
[1] to override any dates derived from file timestamps.  This has the
advantage of being a generic setting that can be adopted by a variety of
different software with similar needs.  Debian automatically sets that
environment variable to the date of the last Debian changelog entry in its
build system, so that means Debian packages no longer require special
configuration to be reproducible in that way.

[1] https://reproducible-builds.org/docs/source-date-epoch/
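
As a rough illustration of the SOURCE_DATE_EPOCH behavior described above,
here is a minimal sketch in Python.  It is not the Pod::Man implementation:
the function name document_date, the YYYY-MM-DD output format, and the
choice to format both sources in UTC are assumptions made for the example.

```python
import os
import time


def document_date(path, environ=None):
    """Return a YYYY-MM-DD date to embed in a generated manual page.

    Hypothetical helper: prefers SOURCE_DATE_EPOCH when it is set to a
    non-negative integer, otherwise falls back to the modification time
    of the input file at `path`.
    """
    env = os.environ if environ is None else environ
    epoch = env.get("SOURCE_DATE_EPOCH")
    if epoch is not None and epoch.isdigit():
        # Reproducible case: the build infrastructure supplied the date.
        timestamp = int(epoch)
    else:
        # Default case: derive the date from the file timestamp, which
        # may differ from one build to the next.
        timestamp = os.stat(path).st_mtime
    # Format in UTC so the result does not depend on the host time zone.
    return time.strftime("%Y-%m-%d", time.gmtime(timestamp))
```

With SOURCE_DATE_EPOCH set, the file timestamp is never consulted, so two
builds of the same source produce the same date without any change to the
package's own build system.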

Versions seem like both a less important and a trickier issue.  On one
hand, reproducible builds in the classic security sense only make sense
when using the same versions of all tools, in which case pure version
information isn't a problem.  This is supported by [2].  (Some of the
other things perlpodspec currently suggests, such as the current time or
the name of the input file, would be more of a problem.)

[2] https://reproducible-builds.org/docs/version-information/

On the other hand, it's still useful to be able to compare the output
between build systems using, say, two different point releases of Perl to
see if there are any significant differences, and the embedded version
numbers add a lot of noise.  I admit that I've frequently annoyed myself
by embedding version numbers in comments in output and then having to
write tedious scripts to filter those version numbers out again so that I
can compare output between old and new versions of tools looking for
significant differences.  I kind of want to tell my future self to stop
doing that, although there is a trade-off between that and having useful
provenance information.
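
The kind of filtering script mentioned above might look something like the
following Python sketch.  It is only an illustration, not a script from this
thread: the function name, the set of comment prefixes, and the assumption
that version numbers always follow the word "version" are all hypothetical.

```python
import re

# A version number such as "3.14159" or "v5.38.2" following the word "version".
VERSION_RE = re.compile(r"(\bversion\s+)v?\d+(?:\.\d+)*", re.IGNORECASE)


def normalize_comment_versions(text, comment_prefixes=('.\\"', "%", "<!--")):
    """Replace version numbers in comment lines with the placeholder X.

    Only lines that start with one of `comment_prefixes` are touched, so
    version numbers in the real content still show up in a comparison.
    """
    lines = []
    for line in text.splitlines(keepends=True):
        if line.lstrip().startswith(comment_prefixes):
            line = VERSION_RE.sub(r"\1X", line)
        lines.append(line)
    return "".join(lines)
```

Running both the old and the new output through such a filter before
comparing them leaves only the differences outside the provenance comments.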

On a practical level, there doesn't seem to be a good generic environment
variable already in use that says "omit provenance information that may
not be significant"; the only two environment variables that have been
standardized for this purpose are SOURCE_DATE_EPOCH and
BUILD_PATH_PREFIX_MAP.  That means that if we want to leave it on by
default but provide an easy way to turn it off, we'd have to make up a new
environment variable, which would at least start as specific to Perl and
would be another thing people had to somehow learn about and track.

For that reason, I'm currently leaning towards dropping this information
by default.  I'm not sure that I want to add a flag to add it back in.
Every new flag is some small amount of ongoing support and documentation
complexity overhead, and it's been a long time since I've used this
information to understand a bug report.

For perlpodspec, based on this discussion, I propose the following fairly
minimal change.  (I've never had a reason to submit a PR against Perl
before; now seems like a good time to start.)

From ac979b8560449d363d1b3309ef5b80aa00b4b8f2 Mon Sep 17 00:00:00 2001
From: Russ Allbery <ea...@eyrie.org>
Date: Sun, 24 Mar 2024 12:13:18 -0700
Subject: [PATCH] Adjust perlpodspec rules for formatter comments

Relax the must requirement that Pod formatters embed version
information in a comment to "may." Add a "should not" rule against
comments containing information that would cause otherwise-identical
output to change unnecessarily.
---
 pod/perlpodspec.pod | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/pod/perlpodspec.pod b/pod/perlpodspec.pod
index ca09d28555..d7387a28d0 100644
--- a/pod/perlpodspec.pod
+++ b/pod/perlpodspec.pod
@@ -695,7 +695,7 @@ formatter will nevertheless treat them the same.)
 =item *
 
 When rendering Pod to a format that allows comments (i.e., to nearly
-any format other than plaintext), a Pod formatter must insert comment
+any format other than plaintext), a Pod formatter may insert comment
 text identifying its name and version number, and the name and
 version numbers of any modules it might be using to process the Pod.
 Minimal examples:
@@ -709,9 +709,11 @@ Minimal examples:
  .\" Pod::Man version 3.14159, using POD::Parser version 1.92
 
 Formatters may also insert additional comments, including: the
-release date of the Pod formatter program, the contact address for
-the author(s) of the formatter, the current time, the name of input
-file, the formatting options in effect, version of Perl used, etc.
+contact address for the author(s) of the formatter, the formatting
+options in effect, version of Perl used, etc. Formatters should avoid
+comments that would cause otherwise-identical output to change
+unnecessarily, such as the current time or the full path to the input
+file.
 
 Formatters may also choose to note errors/warnings as comments,
 besides or instead of emitting them otherwise (as in messages to
-- 
2.43.0

-- 
#!/usr/bin/perl -- Russ Allbery, Just Another Perl Hacker
$^=q;@!>~|{>krw>yn{u<$$<[~||<Juukn{=,<S~|}<Jwx}qn{<Yn{u<Qjltn{ > 0gFzD gD,
 00Fz, 0,,( 0hF 0g)F/=, 0> "L$/GEIFewe{,$/ 0C$~> "@=,m,|,(e 0.), 01,pnn,y{
rw} >;,$0=q,$,,($_=$^)=~y,$/ C-~><@=\n\r,-~$:-u/ #y,d,s,(\$.),$1,gee,print
