On Sat, Oct 31, 2020 at 2:31 AM Jeremy Stanley wrote:

> I have to agree, though in the upstream projects with which I'm
> involved, those generated files are basically a lossy re-encoding of
> metadata from the Git repositories themselves: AUTHORS file
> generated from committer headers, ChangeLog files from commit
> subjects, version information from tag names, and so on. Some of
> this information may be referenced from copyright licenses, so it's
> important in those cases for package maintainers to generate it when
> making their source packages if not using the sdist tarballs
> published by the project.
As the maintainer of the autorevision package (which aims at dumping a
cache of VCS version metadata, for use when exporting a tarball from a
VCS), I've been thinking about this for a while now.

My idea is to modify automake (and other build tools) to have a mode
that basically does `git archive` instead of just calling tar, and that
also creates separate tarballs for all the other generated or embedded
files: one for the autorevision metadata, one for the autotools cruft,
one for a cache of the data needed for AUTHORS/ChangeLog/NEWS etc., and
possibly others.

Debian could then use our multi-component 3.0 (quilt) source format to
take the git archive, the autorevision metadata, and the cache of the
VCS data, but leave the autotools cruft behind. We could then audit the
git repo for generated files and embedded data/code copies, and when
there are none, confidently build the configure script from source,
knowing that everything required is available in the build-deps.

The same could apply to projects uploading to NPM or PyPI, although I'm
not sure whether those support this sort of thing.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise
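To make the tarball split above a bit more concrete, here is a rough Python sketch. It is not an existing automake mode. The component names, the file patterns in `COMPONENTS`, and all of the function names are hypothetical and only for illustration. The tarball names follow the `<source>_<version>.orig-<component>.tar.<ext>` convention that dpkg-source uses for additional upstream tarballs, and the `git archive` options used (`--format`, `--prefix`, `-o`) are standard ones. The sketch only builds the command and sorts file names; it does not run git or write any tarballs.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping: component name -> patterns of generated files.
# A real project would need its own list.
COMPONENTS = {
    "autorevision": ["autorevision.cache"],
    "autotools": ["configure", "aclocal.m4", "config.h.in",
                  "Makefile.in", "*/Makefile.in"],
    "vcsdata": ["AUTHORS", "ChangeLog"],
}


def orig_tarball_name(source, version, component=None, ext="tar.gz"):
    """Name of the main or per-component upstream tarball."""
    suffix = "orig" if component is None else f"orig-{component}"
    return f"{source}_{version}.{suffix}.{ext}"


def git_archive_command(source, version, tag, ext="tar.gz"):
    """Command that exports only what is tracked in git at `tag`."""
    return [
        "git", "archive",
        f"--format={ext}",
        f"--prefix={source}-{version}/",
        "-o", orig_tarball_name(source, version, ext=ext),
        tag,
    ]


def split_generated(paths, components=COMPONENTS):
    """Sort generated files into components; unmatched paths are ignored."""
    result = {name: [] for name in components}
    for path in paths:
        for name, patterns in components.items():
            if any(fnmatchcase(path, pattern) for pattern in patterns):
                result[name].append(path)
                break
    return result


def debian_selection(components=COMPONENTS, skip=("autotools",)):
    """Components a distro would keep, leaving the autotools cruft behind."""
    return [name for name in components if name not in skip]
```

A packager would then fetch the main `orig` tarball plus one `orig-<component>` tarball per name returned by `debian_selection()`, and regenerate the autotools output from the build-deps instead of unpacking it.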