Hello,

On Wed, 17 Feb 2021 18:53:46 -0600
Skip Montanaro <skip.montan...@gmail.com> wrote:

> > If we can get a clean copy of the original sources I think we
> > should put them up under the Python org on GitHub for posterity.
>
> Did that earlier today:
>
> https://github.com/python/pythondotorg/issues/1734

To resolve this issue completely, and to rule out an intermediary
introducing unexpected changes or mistakes into the original sources,
I think that instead of "someone making a tarball", someone should
write a script that makes the tarball. Such a script can then be
reviewed, and the tarball reproduced independently (e.g., by the
admins of python.org).

That's exactly what I did, and I attached the script to the ticket
above:
https://github.com/python/pythondotorg/issues/1734#issuecomment-781129337

For extra details, here is a copy of my comment there:

---

I attach my version of such a script (and also paste it below for
reference, but if you use it, please use the attached version to avoid
any discrepancies due to copy-paste).

The script takes care to preserve not just the data but also the
metadata of the release, by setting file timestamps to the date/time
of the message that contained the first chunk of the shar archive. It
also takes care to create a reproducible tarball, i.e. archives
created by different runs of the script should match each other byte
for byte (cf. https://en.wikipedia.org/wiki/Reproducible_builds). Of
course, that depends on the .tar and .gz formats themselves being
stable (which should de facto be the case, and I hope their
maintainers treat them as such).

As an extra measure, the MD5 sums of the individual files are also
computed and included in the tarball (as MD5SUMS). Finally, the script
itself is also included, as a kind of executable documentation. That's
why it's important for the script itself to be byte-perfect when
recreating the tarball. I also didn't make it executable; it should be
run as sh python-0.9.1-create-tarball.sh.

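To illustrate the byte-for-byte property described above, here is a
minimal sketch. It is not the script attached to the ticket: it builds
a made-up demo-0.1 tree twice and compares the checksums of the two
archives. The tree, the make_tarball helper and the explicit tar flags
are assumptions of this sketch, and it assumes GNU tar 1.28 or later
(for --sort), GNU touch, gzip and md5sum.

```shell
#!/bin/sh
# Illustrative sketch only: NOT python-0.9.1-create-tarball.sh.
# Assumes GNU tar >= 1.28 (for --sort), GNU touch, gzip and md5sum.
set -eu

work=$(mktemp -d)                 # scratch directory for the demo
trap 'rm -rf "$work"' EXIT

# Build a toy "release" tree; the file names are made up.
mkdir -p "$work/demo-0.1/src"
printf 'hello\n' >"$work/demo-0.1/README"
printf 'int main(void) { return 0; }\n' >"$work/demo-0.1/src/main.c"

# Pin every timestamp to one fixed instant.
stamp="19 Feb 1991 17:35:26 GMT"
find "$work/demo-0.1" -exec touch -d "$stamp" {} +

# make_tarball OUT: write a gzip'ed tar of the toy tree to OUT.
# Member order, owner and group are fixed explicitly; gzip -n keeps the
# name and timestamp out of the gzip header.
make_tarball() {
    (
        cd "$work"
        tar --format=gnu --sort=name --owner=0 --group=0 --numeric-owner \
            -cf - demo-0.1 | gzip -n >"$1"
    )
}

make_tarball "$work/run1.tar.gz"
sleep 1                           # let the wall clock advance between runs
make_tarball "$work/run2.tar.gz"

sum1=$(md5sum <"$work/run1.tar.gz" | cut -d' ' -f1)
sum2=$(md5sum <"$work/run2.tar.gz" | cut -d' ' -f1)

if [ "$sum1" = "$sum2" ]; then
    echo "reproducible: both runs produced the same bytes"
else
    echo "NOT reproducible: $sum1 vs $sum2" >&2
    exit 1
fi
```

The flags pin down things that a plain tar invocation leaves to the
environment: member order, owner and group, and the name and timestamp
that gzip would otherwise embed. Permission bits still follow the
local umask, so treat this as a sketch rather than a complete recipe.
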
Under the conditions described above, the tarball produced should have
the following md5sum:

65e0c4140583c7032f35036939cf1bdd  python-0.9.1.tar.gz

https://github.com/python/pythondotorg/files/6001019/python-0.9.1-create-tarball.sh.gz

The script contents for reference (do not copy-paste, use the attached
version above):

#!/bin/sh
#
# This script creates fully reproducible, bytes-perfect tarball of the
# CPython 0.9.1 release (initial public release) as posted by Guido
# van Rossum to the Usenet "alt.sources" newsgroup
# (https://en.wikipedia.org/wiki/Usenet_newsgroup). This is not first
# attempt to recover the original 0.9.1 sources, but many previous
# attempts started from the Dejanews Usenet archives, later acquired
# by Google, which have whitespace issues (tabs converted to spaces).
# This script uses alternative archive source at ftp.fi.netbsd.org,
# which doesn't have whitespace issues.
#
# This script strives to produce fully reproducible archive, and for
# this explicitly sets GMT date of all files included in the archive.
# So, for as long as TAR and GZIP formats are themselves stable across
# systems, this script should produce bytes-exact archive files on any
# system.
#

set -e

# Index: http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/index.gz
cat >urls <<EOF
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910220.10.gz#Python 0.9.1 part 01/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910220.11.gz#Python 0.9.1 part 03/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.31.gz#Python 0.9.1 part 04/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.32.gz#Python 0.9.1 part 05/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.33.gz#Python 0.9.1 part 06/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.34.gz#Python 0.9.1 part 07/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.35.gz#Python 0.9.1 part 08/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.36.gz#Python 0.9.1 part 09/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.37.gz#Python 0.9.1 part 10/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.38.gz#Python 0.9.1 part 11/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.39.gz#Python 0.9.1 part 12/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.40.gz#Python 0.9.1 part 13/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.41.gz#Python 0.9.1 part 14/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.42.gz#Python 0.9.1 part 15/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.43.gz#Python 0.9.1 part 16/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.44.gz#Python 0.9.1 part 17/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.45.gz#Python 0.9.1 part 19/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.46.gz#Python 0.9.1 part 21/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.47.gz#Python 0.9.1 part 02/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.48.gz#Python 0.9.1 part 18/21 <2...@charon.cwi.nl>
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/910224.49.gz#Python 0.9.1 part 20/21 <2...@charon.cwi.nl>
EOF

wget -i urls
gzip -d -f *.gz

rm -rf python-0.9.1
mkdir -p python-0.9.1
unshar -d python-0.9.1 [0-9]*.[0-9][0-9]

find python-0.9.1 -type f | xargs md5sum >MD5SUMS

# Set the modtime based on the date of the "part 01/21" message.
find python-0.9.1/ | xargs touch -d "19 Feb 1991 17:35:26 GMT"
# Set the date of the script itself (and MD5SUMS), to make 100% reproducible
# tarball. Use +30 years date. In reality, script was written a couple of days
# earlier.
touch -d "19 Feb 2021 17:35:26 GMT" python-0.9.1-create-tarball.sh MD5SUMS

# Create tarball, include this script itself as a documentation/reference.
tar cfz python-0.9.1.tar.gz python-0.9.1-create-tarball.sh MD5SUMS python-0.9.1/
touch -d "19 Feb 1991 17:35:26 GMT" python-0.9.1.tar.gz

md5sum python-0.9.1.tar.gz

---

--
Best regards,
 Paul                          mailto:pmis...@gmail.com
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/YPDZE4GX5C3BDKS3EUBJZ3Y35TGYY7NF/
Code of Conduct: http://python.org/psf/codeofconduct/
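
For anyone unpacking the result, a per-file checksum list such as
MD5SUMS can be checked with md5sum -c. The following is a minimal
sketch on a made-up demo-0.1 tree, not part of the script above. The
file names and the two result variables are hypothetical, and sorting
the file list is an addition of this sketch so that the listing does
not depend on directory order. It assumes md5sum with -c support.

```shell
#!/bin/sh
# Illustrative sketch only; the tree and file names are made up.
# Assumes md5sum supports -c (as in GNU coreutils).
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir -p demo-0.1
printf 'hello\n' >demo-0.1/README
printf 'all:\n\tcc -o demo main.c\n' >demo-0.1/Makefile

# Record one checksum per file, in a fixed (byte-wise sorted) order.
find demo-0.1 -type f | LC_ALL=C sort | xargs md5sum >MD5SUMS

# Verify the untouched tree, as a recipient would after unpacking.
if md5sum -c MD5SUMS >/dev/null 2>&1; then
    intact=yes
else
    intact=no
fi

# Simulate the tabs-to-spaces damage mentioned in the script's comments.
tr '\t' ' ' <demo-0.1/Makefile >Makefile.tmp
mv Makefile.tmp demo-0.1/Makefile

if md5sum -c MD5SUMS >/dev/null 2>&1; then
    damage_detected=no
else
    damage_detected=yes
fi

echo "intact tree verifies: $intact"
echo "whitespace damage detected: $damage_detected"
```

A single tab replaced by a space is enough to make the check fail,
which is the kind of damage the earlier recovery attempts suffered
from.
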