Hi Guix,

when Ricardo recently added guile-aiscm to Guix, I was confused that both the version field of the package and the commit field of the git-reference used in its origin were raw strings. It turns out that this is a rare pattern, observed in fewer than 200 packages currently in Guix. The reason for doing so (as far as I understand, and as was explained to me on IRC) is that tags are in principle mutable and hence cannot be relied on when fetching sources. I do have a few issues with that explanation, but before that, let's take a step back and discuss the relation between version and commit.
Consider a package being added or updated in Guix. At the time of the commit, upstream has the tag v1.2.3 pointing to commit deadbeef. We therefore create a Guix package with version "1.2.3" pointing to said commit, either directly or indirectly. At this point, one of the following holds:

(1) Guix "1.2.3" -> upstream "v1.2.3" -> upstream "deadbeef"
(2) Guix "1.2.3" -> upstream "deadbeef" <- upstream "v1.2.3"

From either, it follows that Guix "1.2.3" = upstream "v1.2.3". If upstream keeps their tags around, both forms are equivalent, but (1) is more convenient: it allows us to derive the commit from the version, which is often done through an affine mapping such as prepending "v".

Problems arise when upstreams move or delete tags. At that point, Guix packages that use them break and are no longer able to fetch their source code. Raw commits are in principle resilient to this kind of denial of service; to break them, upstreams would have to delete the commits themselves, along with any backups such as SWH. There is certainly an argument for robustness to be made here, particularly concerning `guix time-machine', though as noted it is not infallible.

It should be noted that in the case of moved or deleted tags, the assertion Guix "1.2.3" = upstream "v1.2.3" no longer holds. Widespread use of this pattern under the above reasoning would imply that those upstreams can't be trusted to keep their tags stable, whereas there are probably few offenders in that category (considering also that Guix is not the only tool they'd break if they did move or delete tags). More importantly, if we do have a non-trustworthy upstream, it could be reasoned that referring to some tag is as good as referring to a random commit, and that let-bound commits and revisions therefore ought to be used.

As any good Sith would, the above talks in absolutes, or at the very least uses default logic without considerable fallbacks.
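To spell out forms (1) and (2) and what happens to them when a tag moves or disappears, here is a toy model. It is plain Python rather than Guile, and everything in it (the dictionary standing in for upstream, the function names, the second commit cafebabe) is made up for illustration; it is not how Guix resolves references.

```python
# Toy model: upstream is a mutable mapping from tag names to commits.
# All names and commit ids below are hypothetical.

def tag_for(version):
    """The usual version -> tag mapping, here simply prepending "v"."""
    return "v" + version

def commit_via_tag(tags, version):
    """Form (1): follow the tag derived from the version (None if it is gone)."""
    return tags.get(tag_for(version))

PINNED = "deadbeef"                      # form (2): commit recorded in the package
upstream_tags = {"v1.2.3": "deadbeef"}

# While the tag is untouched, both forms name the same commit.
agree_before = commit_via_tag(upstream_tags, "1.2.3") == PINNED

# Upstream moves the tag to a different commit.
upstream_tags["v1.2.3"] = "cafebabe"
agree_after_move = commit_via_tag(upstream_tags, "1.2.3") == PINNED

# Upstream deletes the tag altogether.
del upstream_tags["v1.2.3"]
commit_after_delete = commit_via_tag(upstream_tags, "1.2.3")

print(agree_before, agree_after_move, commit_after_delete)
```

This prints `True False None`: once the tag moves, form (1) silently names a different commit, while form (2) still names deadbeef but no longer corresponds to anything upstream calls v1.2.3.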
On the note of fallbacks, we also have the issue that Guix fails on the first download that does not match the hash, instead of e.g. continuing to SWH to fetch an archive of the old tag (as well as other fallback-related issues, including those from the "Tricking Peer Review" thread).

Putting those aside for a while, there is an all but endless number of upstreams for which we can't tell ahead of time whether they will act nicely or not. The status quo for most of our packages is to assume that they do and fail loudly if they don't. The proposed alternative is to assume they don't and miss out on nice things if they do. However, even under that assumption we also miss out on ninja version bumps, and the only way of noticing, other than paranoid amounts of checking whether the tag moved, would be to wait for a mail from upstream claiming that they actually wanted us to notice the ninja bump.

Neither of the above is really satisfactory. At the very least, if raw strings are to be used in the commit fields for tags that "once existed, but maybe no longer point to that commit", I'd want a comment like the ones I find in minetest.scm to mentally prepare me for what I'm about to read in the rest of the package description, but I'd much prefer using let-bound commit/revision pairs. Perhaps we could make revision "0" (alternatively #f, if we don't want current versions to break) special, in that a git-version with it expands to just version.

Long-term, we might want to support having multiple <git-reference>s in git-fetch: if the first one fails due to a hash mismatch, we would warn about that instead of producing an error, and thereafter continue with the second, third, etc., similar to how we currently have mirror:// urls for some well-known mirrored repositories. That way, we have a system to warn us about naughty upstreams while also providing robustness for the time machine.

What do y'all think?
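PS: To make the two proposals a bit more concrete, here is a rough sketch of the behaviour I have in mind. Again this is a toy in Python, not a patch: `git_version` and `fetch_with_fallback` are hypothetical names, `None` stands in for #f, the dictionary stands in for upstream plus its archives, and the "version-revision.commit-prefix" format is only my understanding of what git-version produces today.

```python
import hashlib
import warnings

def git_version(version, revision, commit):
    """Hypothetical git-version variant: a revision of None (standing in
    for #f) means "this commit is the tagged release", so the result is
    just the plain version."""
    if revision is None:
        return version
    return f"{version}-{revision}.{commit[:7]}"

def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()

def fetch_with_fallback(references, expected_hash, fetch):
    """Try each reference in order.  A failed download or a hash mismatch
    only produces a warning; an error is raised once all have failed."""
    for ref in references:
        try:
            data = fetch(ref)
        except LookupError as err:
            warnings.warn(f"{ref}: download failed ({err!r})")
            continue
        if sha256_hex(data) == expected_hash:
            return ref, data
        warnings.warn(f"{ref}: hash mismatch, trying next reference")
    raise RuntimeError("no reference produced the expected hash")

# Toy upstream: the tag was moved to new content, the old commit still exists.
store = {"v1.2.3": b"ninja-bumped sources", "deadbeef": b"original sources"}
expected = sha256_hex(b"original sources")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    used_ref, data = fetch_with_fallback(
        ["v1.2.3", "deadbeef"], expected, store.__getitem__)
```

Here the moved tag yields one warning and the build carries on with deadbeef, so we would learn about the naughty upstream without the package breaking. Likewise, `git_version("1.2.3", None, "deadbeef")` gives plain "1.2.3", while any other revision keeps the usual suffix.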