Hi! A bit of context: we already had automatic SWH fallback for Git checkouts, which is to say that any origin that uses ‘git-fetch’ would have its checkout transparently fetched from SWH if upstream vanished (this dates back to commit 608d3dca89d73fe7260e97a284a8aeea756a3e11, Nov. 2018).
What this patch series provides is SWH fallback for full Git clones (as opposed to flat checkouts). It works for anything that uses (guix git). That includes <git-checkout>, used by transformation options:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix build footswitch --with-git-url=footswitch=http://example.org/sdf --with-commit=footswitch=1eabc563ca5692b3e08d84f1f0e6fd2283284469 -n
updating checkout of 'http://example.org/sdf'...
SWH: found revision 1eabc563ca5692b3e08d84f1f0e6fd2283284469 with directory at 'https://archive.softwareheritage.org/api/1/directory/ad8976564375ee55f645387bbcdf4b66e6582fbf/'
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/HEAD
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/branches/
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/config
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/description
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/applypatch-msg.sample
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/commit-msg.sample
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/fsmonitor-watchman.sample
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/post-update.sample
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-applypatch.sample
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-commit.sample
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-push.sample
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-rebase.sample
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/pre-receive.sample
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/prepare-commit-msg.sample
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/hooks/update.sample
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/info/
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/info/exclude
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/info/refs
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/info/
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/info/packs
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/pack/
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/pack/pack-ed28f44a2599fe2d0a5f1b1a84c247c43afd14a1.idx
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/objects/pack/pack-ed28f44a2599fe2d0a5f1b1a84c247c43afd14a1.pack
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/refs/
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/refs/heads/
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/refs/heads/master
swh:1:rev:1eabc563ca5692b3e08d84f1f0e6fd2283284469.git/refs/tags/
retrieved commit 1eabc563ca5692b3e08d84f1f0e6fd2283284469
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
substitute: updating substitutes from 'https://bayfront.guix.gnu.org'... 100.0%
The following derivation would be built:
   /gnu/store/39kzsy5kgj5150q6zgckc2hbxp999adw-footswitch-git.1eabc56.drv
--8<---------------cut here---------------end--------------->8---

In the example above, we pass a bogus Git URL, but since the target commit is known, (guix git) automatically fetches a bare Git repository from the SWH vault.
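
To make the control flow concrete, here is a rough sketch in Python (not the Guile code from this series) of the fallback decision: try the upstream clone first, and only if that fails and the reference is a full commit ID, ask the vault for a bare repository. The names 'clone_with_fallback', 'clone' and 'fetch_from_vault' are hypothetical and are not part of (guix git) or (guix swh). The vault endpoint paths reflect my understanding of the public SWH Web API and should be treated as assumptions.

```python
import re

# Base URL as it appears in the transcript above.
SWH_API = "https://archive.softwareheritage.org/api/1"


def is_commit_id(ref):
    """Return True if REF looks like a full SHA1 commit ID (40 hex digits)."""
    return re.fullmatch(r"[0-9a-f]{40}", ref) is not None


def revision_swhid(commit):
    """Return the SWHID of a Git commit, e.g. 'swh:1:rev:1eabc5...'."""
    if not is_commit_id(commit):
        raise ValueError(f"not a full SHA1 commit ID: {commit!r}")
    return f"swh:1:rev:{commit}"


def vault_urls(commit):
    """Return the URLs involved in the fallback for COMMIT.

    Assumption: the 'git-bare' vault endpoints have this shape; check the
    SWH API documentation before relying on them.
    """
    swhid = revision_swhid(commit)
    return {
        "revision": f"{SWH_API}/revision/{commit}/",
        "cook": f"{SWH_API}/vault/git-bare/{swhid}/",
        "download": f"{SWH_API}/vault/git-bare/{swhid}/raw/",
    }


def clone_with_fallback(url, ref, clone, fetch_from_vault):
    """Clone URL with CLONE; on failure, fall back to the vault.

    CLONE and FETCH_FROM_VAULT are caller-supplied (hypothetical) callables.
    The fallback only applies when REF is a commit ID, since a branch name
    cannot be mapped to a revision once upstream is gone.
    """
    try:
        return clone(url)
    except OSError:
        if is_commit_id(ref):
            return fetch_from_vault(revision_swhid(ref))
        raise
```

With a branch name such as "master" instead of a commit ID, the sketch simply re-raises the clone error, which mirrors the "target commit is known" condition above.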

It also works for channels, which is what zimoun reported here:

--8<---------------cut here---------------start------------->8---
$ cat /tmp/chan.scm
(list (channel
       (name 'guix)
       (url "https://git.savannah.gnu.org/git/guix.git")
       (commit "f91ae9425bb385b60396a544afe27933896b8fa3")
       (introduction
        (make-channel-introduction
         "9edb3f66fd807b096b48283debdcddccfea34bad"
         (openpgp-fingerprint
          "BBB0 2DDF 2CEA F6A8 0D1D E643 A2A0 6DF2 A33A 54FA"))))
      (channel
       (name 'guix-past)
       (url "https://does-not-exist.inria.fr/guix-hpc/guix-past")
       (commit "77e183dc7ade307ad3409fad4b71f12e266de910")
       #;(introduction
        (make-channel-introduction
         "0c119db2ea86a389769f4d2b9c6f5c41c027e336"
         (openpgp-fingerprint
          "3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5")))))
$ ./pre-inst-env guix time-machine -C /tmp/chan.scm -- describe
Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'...
Updating channel 'guix-past' from Git repository at 'https://does-not-exist.inria.fr/guix-hpc/guix-past'...
SWH: found revision 77e183dc7ade307ad3409fad4b71f12e266de910 with directory at 'https://archive.softwareheritage.org/api/1/directory/7c6aa10e1e0fa54199566145c6a453731872b87d/'
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/HEAD
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/branches/
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/config
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/description
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/hooks/
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/info/
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/info/exclude
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/info/refs
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/info/
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/info/packs
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/pack/
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/pack/pack-e6c0a4813509178eed735708dd60503353a50b9c.idx
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/objects/pack/pack-e6c0a4813509178eed735708dd60503353a50b9c.pack
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/refs/
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/refs/heads/
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/refs/heads/master
swh:1:rev:77e183dc7ade307ad3409fad4b71f12e266de910.git/refs/tags/
Computing Guix derivation for 'x86_64-linux'... \  C-c C-c
--8<---------------cut here---------------end--------------->8---

Here, the ‘guix-past’ channel is transparently cloned from SWH.

This is pretty cool, because having the whole repo around is what permits things like downgrade prevention¹ and news support².

Finally we can enjoy content-addressability and brittle URLs are becoming a thing of the past!*

Limitations
~~~~~~~~~~~~

Yes, there’s a couple of them.

First, fallback is implemented only for fresh clones, not for updates. Thus, if I rerun the first example with a different commit, now that the clone is in ~/.cache/guix/checkouts, I get:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix build footswitch --with-git-url=footswitch=http://example.org/sdf --with-commit=footswitch=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa -n
updating checkout of 'http://example.org/sdf'...
guix build: error: Git failure while fetching http://example.org/sdf: unexpected http status code: 404
--8<---------------cut here---------------end--------------->8---

Second, clones from SWH only contain the one branch that the revision is on. For channels, that means that the ‘keyring’ branch is not fetched, which is why I commented out ‘introduction’ in /tmp/chan.scm above. If I uncomment it, I get:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix time-machine -C /tmp/chan.scm -- describe
Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'...
Updating channel 'guix-past' from Git repository at 'https://does-not-exist.inria.fr/guix-hpc/guix-past'...
guix time-machine: error: Git error: cannot locate remote-tracking branch 'origin/keyring'
--8<---------------cut here---------------end--------------->8---

The SWH folks tell me it’ll eventually be possible to map a revision to its containing snapshot(s) via the HTTP API, and to obtain entire snapshots (i.e., the repo and all its branches) from the vault. That’s what we need to fix this issue.

*Third, and this answers the asterisk above, we must keep in mind that this is content-addressability *with SHA1*. Generating a chosen-prefix collision is becoming affordable³, so users absolutely need an additional mechanism to authenticate the code they fetch.

For origins, we have the content SHA256, so we’re fine. For channels, we have Guix’s authentication mechanism¹, except it’s not available yet via SWH, as I wrote above. For the footswitch example above using ‘--with-commit’, we don’t have any authentication method, but in fact, that’s the situation of Git repositories in general: they can rarely be authenticated.

Overall, I think it’s a step in the right direction.

Thoughts?

Thanks to vlorentz and olasd on #swh-devel for their support!

Thanks,
Ludo’.

¹ https://guix.gnu.org/en/blog/2020/securing-updates/
² https://guix.gnu.org/en/blog/2019/spreading-the-news/
³ https://sha-mbles.github.io/

Ludovic Courtès (3):
  swh: Support downloads of bare Git repositories.
  git: 'update-cached-checkout' can fall back to SWH when cloning.
  git: 'reference-available?' recognizes 'tag-or-commit'.

 guix/git.scm | 45 +++++++++++++++++++++++++++++++++++++++++++--
 guix/swh.scm | 52 ++++++++++++++++++++++++++++++++++++++++------------
 2 files changed, 83 insertions(+), 14 deletions(-)

-- 
2.33.0