On 30 September 2017 at 06:02, Thomas Kluyver <tho...@kluyver.me.uk> wrote:
> On Fri, Sep 29, 2017, at 07:16 PM, Matthias Bussonnier wrote:
>> Second; is there a convention to store the SDE value ? I don't seem to
>> be able to find one. It is nice to have reproducible build; but if
>> it's a pain for reproducers to find the SDE value that highly decrease
>> the value of SDE build.
>
> Does it make sense to add a new optional metadata field to store the
> value of SOURCE_DATE_EPOCH if it's set when a distribution is built? I
> guess it could cause problems if unpacking & repacking a tarball means
> that its metadata is no longer accurate, though.

For distro-level reproducible build purposes, we typically treat the
published tarball *as* the original sources, and don't really worry
about the question of "Can we reproduce that tarball, from that VCS
tree?".

This stems from the original model of open source distribution, where
publication *was* a matter of putting a tarball up on a website
somewhere, and it was an open question as to whether or not the
publisher was even using a version control system at all (timeline:
RCS=1982, CVS=1986, SVN=2000, git/hg=2005, with Linux distributions
getting their start in the early-to-mid 1990s).

So SOURCE_DATE_EPOCH gets applied *after* unpacking the original
tarball, rather than being used to *create* the tarball (we already
know when the publisher created it, since that's part of the tarball
metadata).

Python's sdists mess with that assumption a bit, since it's fairly
common to include generated C files that aren't part of the original
source tree, and Cython explicitly recommends doing so in order to
avoid requiring Cython as a build-time dependency:
http://docs.cython.org/en/latest/src/reference/compilation.html#distributing-cython-modules

So in many ways, this isn't the problem that SOURCE_DATE_EPOCH on its
own is designed to solve - instead, it's asking the question of "How do
I handle the case where my nominal source archive is itself a built
artifact?". That means you not only need to record source timestamps of
the original inputs you used to build the artifact (which the version
control system will give you), you also need to record details of the
build tools used (e.g. using a different version of Cython will
generate different code, and hence different "source" archives), and
decide what to do with any timestamps on the *output* artifacts you
generate (e.g. you may decide to force them to match the commit date
from the VCS).
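
To make that last option concrete, here is a minimal sketch of forcing
every timestamp in a generated archive to one fixed value. It is only
an illustration under stated assumptions, not what any existing sdist
tool does: `build_archive` is a hypothetical helper, the files are
passed in as an in-memory dict, and the epoch value is assumed to come
from somewhere like SOURCE_DATE_EPOCH or the output of
`git log -1 --format=%ct`.

```python
import gzip
import io
import tarfile


def build_archive(files, epoch):
    """Return .tar.gz bytes for `files` (a dict of name -> bytes).

    Hypothetical helper: every timestamp is forced to `epoch` so that
    the same inputs give the same bytes on any machine, at any time.
    """
    raw = io.BytesIO()
    # mtime=epoch pins the timestamp stored in the gzip header;
    # filename="" keeps a file name out of that header.
    with gzip.GzipFile(filename="", fileobj=raw, mode="wb", mtime=epoch) as gz:
        with tarfile.open(fileobj=gz, mode="w", format=tarfile.USTAR_FORMAT) as tar:
            for name in sorted(files):  # member order must not vary
                data = files[name]
                info = tarfile.TarInfo(name)
                info.size = len(data)
                info.mtime = epoch  # e.g. the VCS commit date
                info.mode = 0o644
                # Drop builder-specific ownership details.
                info.uid = info.gid = 0
                info.uname = info.gname = ""
                tar.addfile(info, io.BytesIO(data))
    return raw.getvalue()


if __name__ == "__main__":
    demo = {"pkg/module.py": b"x = 1\n", "pkg/generated.c": b"int x;\n"}
    first = build_archive(demo, 1506700000)
    second = build_archive(demo, 1506700000)
    print(first == second)
```

Note that this only pins the archive metadata. The contents of
generated files (such as Cython output) still depend on the tool
versions used to produce them, which is why those need to be pinned
separately.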

So saying "SOURCE_DATE_EPOCH will be set to the VCS commit date when
creating an sdist" would be a reasonable thing for an sdist creation
tool to decide to do, and combined with something like `Pipfile.lock`
in `pipenv`, or a `dev-requirements.txt` with fully pinned versions,
*would* go a long way towards giving you reproducible sdist archives.

However, it's not a problem to be solved by adding anything to the
produced sdist: it's a property of the publishing tools that create
sdists to aim to ensure that given the same inputs, on a different
machine, at a different time, you will nevertheless still get the same
result.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig