fosslinux via rb-general <[email protected]> wrote:
> Absolutely agreed! So let's have a definition that clearly defines top-down
> work as progress toward reproducibility. :D

I am happy to see all sorts of progress toward reproducible distributions, whether it involves compiling from source code or otherwise.

Perhaps the definition of a "pure function" from mathematical computer science can help:

  https://en.wikipedia.org/wiki/Pure_function

I think that we want most or all of the tools used to build a release to be pure functions, in the sense that when run with the same inputs, they produce the same outputs. This property is independent of whether the tool's inputs are "source code" or not. So Roland Clobus's improved tool for building a bootable ISO from a collection of files could be "pure", even if its input is not source code. (If the tool pulls its collection of files from an uncontrolled source out on the network, then it can't be pure, since somebody else could change those files, causing the tool's output to change.)

A stricter property might be that a program is "portably pure" when it produces the same outputs from the same inputs despite being run on a variety of processor types, operating systems, etc. Many of the standard tools on GNU and UNIX systems are designed to be portably pure, and some achieve that.

Note that reproducible distributions can use tools that aren't portably pure, since we only require the tool to work purely within the environment of the distribution itself. E.g. the output could depend on the word size of the system it's running on (32-bit or 64-bit), and still the distribution could be reproducible. You may not be able to cross-build the distribution from some disparate host system, but you could reproducibly rebuild it on itself.

(The Wikipedia definition uses the C- or C++-language definition of "function", but we can generalize that to the properties of an entire executable program that one might run from a shell. One issue with executable programs that read files in ordinary UNIX file systems is that they change the last-access time of those files as a side effect. Their execution also leaves log records in accounting files and such. For a program to be considered a pure function within the context of a process such as the build of a distribution, those side effects "must not matter" to the overall process that we are contemplating. E.g. if those files are later copied by "tar" or "genisoimage" to make a release, then the build scripts must take care that the access time is not copied into the output file.)

	John
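
As a rough illustration of the access-time point above, here is a minimal Python sketch of an archiving step that behaves as a pure function of file names and contents. It is only a sketch under stated assumptions, not how any particular distribution or tool does it: the names build_archive and demo are hypothetical, and the choices to sort entries, pin the modification time to 0, and use default ownership and mode are made up for the example.

```python
import io
import os
import tarfile
import tempfile


def build_archive(root: str, fixed_mtime: int = 0) -> bytes:
    """Pack regular files under `root` into a tar archive whose bytes
    depend only on the relative file names and the file contents."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit directories in a stable order
        for name in filenames:
            paths.append(os.path.join(dirpath, name))

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for path in sorted(paths):
            with open(path, "rb") as f:
                data = f.read()
            # Build the header by hand instead of copying it from stat(),
            # so timestamps, owner and permissions on disk cannot leak in.
            info = tarfile.TarInfo(os.path.relpath(path, root).replace(os.sep, "/"))
            info.size = len(data)
            info.mtime = fixed_mtime  # assumption: one pinned timestamp
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def demo() -> bool:
    """Build twice, disturbing the on-disk timestamps in between."""
    with tempfile.TemporaryDirectory() as root:
        for name, text in [("b.txt", "beta\n"), ("a.txt", "alpha\n")]:
            with open(os.path.join(root, name), "w") as f:
                f.write(text)
        first = build_archive(root)
        for name in os.listdir(root):
            # Change both access and modification time of every input.
            os.utime(os.path.join(root, name), (1_000_000_000, 1_000_000_000))
        second = build_archive(root)
        return first == second
```

Reading the inputs still updates their access times on many file systems, so the side effect happens; the point is that it "does not matter", because nothing derived from stat() reaches the output.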
