On 2023-04-26, James Addison wrote:
> On Wed, 26 Apr 2023 at 18:48, Vagrant Cascadian
> <[email protected]> wrote:
>>
>> On 2023-04-26, James Addison wrote:
>> > On Tue, 18 Apr 2023 at 18:51, Vagrant Cascadian
>> > <[email protected]> wrote:
>> >> > James Addison <[email protected]> wrote:
>> >> This is why in the reproducible builds documentation on timestamps,
>> >> there is a paragraph "Timestamps are best avoided":
>> >>
>> >> https://reproducible-builds.org/docs/timestamps/
>> >>
>> >> Or as I like to say "There are no timestamps quite like NO timestamps!"
>> >
>> > I see a parallel between the use of timestamps as a key for
>> > data-lookup (as in Holger's developers-reference package), and the use
>> > of locale as a similar data-lookup key (as in the case of localised
>> > documentation builds).
>>
>> > I'm not sure what the equivalent approach is for localisation, though.
>> > Command-line software, for example, requires at least one written
>> > natural-language to be usable, and as a second use case, providing
>> > natural-language documentation with software is highly recommended (is
>> > it part of the software? maybe not. but a sufficiently-confusing
>> > poorly-translated error message could be as serious as a code-related
>> > bug, I think?).
>> >
>> > Linking back to my recent experience with Sphinx, and from the
>> > perspective of allowing-users-to-verify-their-software, I'd tend to
>> > think that an ideally-produced, reproducible, localised software would
>> > include _all_ available translations in the build artifact. Some of
>> > that could be retrieved at runtime (gettext, for example), and some
>> > could be static (file-backed HTML documentation, where runtime lookups
>> > might not be so straightforward).
>>
>> I struggle to see the parallel. A timestamp is an arbitrary value based
>> on when you built it, whereas the locale-rendered document should be
>> reproducibly translated based on the translations you have available at
>> the time you run whatever process generates the translated version of
>> the document/binary, and regardless of the locale of the build
>> environment.
>
> Ok, I think I understand. Please check my understanding, though: I
> interpret your perspective as matching the ideal-world scenario that
> John outlined, where the SOURCE_DATE_EPOCH value has no effect at all
> on the output of the build

Yes, ideally SOURCE_DATE_EPOCH does not matter. It is a workaround to
embed a (hopefully meaningful) timestamp, when from a reproducible
builds perspective, ideally there would be no timestamp at all in the
resulting artifacts. SOURCE_DATE_EPOCH is a tolerable compromise when
leaving out timestamps entirely is too difficult to achieve
(technically, politically, emotionally, logistically ...).

> Until then, I see both the build-time (SOURCE_DATE_EPOCH) and
> build-locale as inputs that do affect the output of software build
> systems, and believe that relevant guidance could help projects
> migrate towards reproducibility.

I would say a build should be reproducible regardless of the build
environment locale. If you want to generate, say, README.fr.txt, the
build process translating that from README.txt should force the locale
used to generate that document (e.g. LC_ALL=fr_FR.UTF-8), ignoring the
locale of the host system (e.g. C.UTF-8) and the locale of the user
logged into that system (e.g. es_ES.UTF-8); in this case, the locale of
the build environment should be made irrelevant by whatever build
process is used.

Maybe the build logs respect the user or system locale in some ways,
but the resulting build artifact (e.g. README.fr.txt) should be immune
to the system and user locale settings.

>> While there almost certainly might be more than one legitimate
>> translation for a given work, your process for rendering it should
>> really only have one particular output given a particular input
>> (e.g. the source language input and the descriptions of how to translate
>> it to the desired language)... barring, of course, bugs in the system
>> ... or am i missing something entirely?
>
> No, I don't think you missed anything, and I think we have the same
> understanding of the components. We're likely arriving from different
> perspectives on the problem space.
>
> My question is approximately this: for some source software developed
> in a natural language that I don't read or understand, and that
> includes statically-built documentation (say, HTML files for example),
> could I determine that the distributed software (an installer file
> downloaded from the web, for example) recommended to me because it
> includes support for a natural language that I _do_ understand is
> identical to the one in the developers' own natural language?

You have confused me here... Text in two different languages cannot be
bit-for-bit identical... at least, in the vast majority of cases for
any significantly large content; sometimes individual words or even
short phrases may be identical between two similar languages.

So no, I do not think it is correct or possible to say it is identical;
reproducible builds do not help with confirming the accuracy of the
meaning of the translation.

> (and I think that yes, it's possible: build the source to include the
> content from all available languages, and distribute that single copy;
> the translations may be better or worse in some areas, but we can all
> agree that it is not only the same source, but the same build of that
> source)

Yes, given the same input files, translation files, etc. it should
produce a bit-for-bit identical reproducible result; that is what
reproducible builds can promise: a strong connection between a built
artifact and the source from which it was built.

live well,
  vagrant
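
To make the point about forcing the locale concrete, here is a minimal
sketch in Python of one way a build step could do it. It is not taken
from any particular build system: the helper names forced_locale_env
and run_translation_step are hypothetical, and the command passed in
stands for whatever tool actually renders the translated document.

```python
import os
import subprocess


def forced_locale_env(target_locale, base_env=None):
    """Return a copy of the environment in which only target_locale
    decides localisation (hypothetical helper, for illustration)."""
    env = dict(os.environ if base_env is None else base_env)
    # Drop every locale-related variable inherited from the host system
    # or the logged-in user. LANGUAGE is cleared too, because GNU gettext
    # consults it ahead of LC_ALL when choosing message catalogs.
    for name in list(env):
        if name in ("LANG", "LANGUAGE") or name.startswith("LC_"):
            del env[name]
    # LC_ALL overrides LANG and all individual LC_* categories.
    env["LC_ALL"] = target_locale
    return env


def run_translation_step(command, target_locale, base_env=None):
    """Run one translation command under the forced locale and return
    its standard output (the command itself is an assumption)."""
    result = subprocess.run(
        command,
        env=forced_locale_env(target_locale, base_env),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

With this, a host running C.UTF-8 and a user session running
es_ES.UTF-8 both hand the same environment to the step that generates
README.fr.txt, so the build environment's locale stops being an input.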
