Hi all,

On Fri, 31 Dec 2021 at 10:31, Ricardo Wurmus <rek...@elephly.net> wrote:

> I have no strong feelings for or against any of the proposed options.  I
> think that using raw commits might not be great for our tooling because
> we’re not reusing an existing version string and would need to remember
> to update the raw commit as well.  But other than that I don’t find the
> raw commit to introduce readability problems for humans.

By tooling, Ricardo, do you mean the ’importers’ and other ’updaters’?

Well, a general minor comment about readability and metadata.  The
anatomy of a package is:

--8<---------------cut here---------------start------------->8---
(define-public a-symbol
  (package
    (name "a-name")
    (version "1.2.3")
    (source (origin
              (method git-fetch)
              (uri (git-reference
                    (url "https://an-url.somewhere")
                    (commit ????)))
              (file-name (git-file-name name version))
              (sha256
               (base32
                "09rdbcr8dinzijyx9h940ann91yjlbg0fangx365llhvy354n840"))))
    (build-system gnu-build-system)
    (home-page "https://another-url.somewhere")
    (synopsis "Guile extension for numerical arrays and tensors")
    (description "AIscm is a Guile extension for numerical arrays and tensors.
Performance is achieved by using the LLVM JIT compiler.")
    (license license:gpl3+)))
--8<---------------cut here---------------end--------------->8---

Here, a-symbol, a-name and the home-page, synopsis and description
fields are Guix specific: they are metadata added by Guix packagers.
The version is also Guix specific.  Sometimes we patch (for security
reasons, to fix a bug, to quickly backport something, to remove
non-free bits, to unbundle stuff, to make the package work with the
rest of the Guix packages, or for whatever other reason), or we apply
some build options specifically for Guix.  In those cases the version
“1.2.3” is not always changed, and therefore it does not necessarily
correspond to what upstream refers to as “1.2.3”, or what Debian calls
“1.2.3”, etc.  The field ’version’ is Guix specific, at the same level
of metadata as ’name’, ’home-page’, ’synopsis’ or ’description’.
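
To make the ???? placeholder concrete, here is a minimal sketch of the
two candidates under discussion.  It assumes a ’guix repl’ session; the
"v" tag prefix and the commit hash are made-up placeholders, not taken
from any real package.

--8<---------------cut here---------------start------------->8---
(use-modules (guix git-download))     ; provides 'git-reference'

(define version "1.2.3")

;; Candidate 1: extrinsic reference.  The tag is derived from the Guix
;; 'version'; the "v" prefix is an assumption, upstream picks the format.
(define by-tag
  (git-reference
   (url "https://an-url.somewhere")
   (commit (string-append "v" version))))

;; Candidate 2: raw commit hash (hypothetical value), which identifies
;; the content independently of any tag naming scheme.
(define by-commit
  (git-reference
   (url "https://an-url.somewhere")
   (commit "0123456789abcdef0123456789abcdef01234567")))
--8<---------------cut here---------------end--------------->8---

Either way, ’version’ and the other metadata fields stay the same.
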

In other words, these fields depend only on choices made by the Guix
packagers.

The ’origin’ part, by contrast, is not Guix specific; it is only
upstream specific.  Obviously, as a package distributor, Guix makes its
’version’ match as closely as possible what upstream refers to as their
version, most of the time using Git tags.  This tag is upstream
specific: sometimes it is “v1.2.3”, sometimes “1.2.3”, sometimes
“release-1.2.3”, sometimes “r1.2.3”, or whatever else.  We often map
’version’ to ’tag’ using ’string-append’.  For methods other than
git-fetch, we also use a map, but from ’version’ to URL, or from
’version’ to ’changeset’, or from ’version’ to ’revision’, etc.

On a side note, I fail to see why using a commit hash is an issue for
’git-fetch’, despite its content-addressing advantages, when the
equivalent seems not to be one for ’svn-fetch’, as in:

--8<---------------cut here---------------start------------->8---
(version "0.5.1")
(source
 (origin
   (method svn-fetch)
   (uri (svn-reference
         (url (string-append
               "https://code.call-cc.org/svn/chicken-eggs/"
               "release/5/srfi-1/tags/"
               version))
         (revision 39055)
         (user-name "anonymous")
         (password "")))
   (file-name (string-append "chicken-srfi-1" version "-checkout"))
   (sha256
    (base32
     "02940zsjrmn7c34rnp1rllm2nahh9jvszlzrw8ak4pf31q09cmq1"))))
--8<---------------cut here---------------end--------------->8---

or, as another example:

--8<---------------cut here---------------start------------->8---
(let ((revision 505)
      (release "1.09.01"))
  (package
    (name "fullswof-2d")
    (version release)
    (source (origin
              (method svn-fetch)
              (uri (svn-reference
                    (url (string-append "https://subversion.renater.fr/"
                                        "anonscm/svn/fullswof-2d/tags/"
                                        "release-" version))
                    (revision revision)))
              (file-name (string-append "fullswof-2d-" version "-checkout"))
              (sha256
               (base32
                "16v08dx7h7n4wyddzbwimazwyj74ynis12mpjfkay4243npy44b8"))))
--8<---------------cut here---------------end--------------->8---

I leave aside the readability point for git-fetch or any other method,
since it is only
habits, or more precisely collective conventions and a bit of personal
preference. :-)

When we speak about robustness and the long term, the issue is the
field ’uri’.  Having something extrinsic, i.e., which does not depend
on the content, such as URL+tag or URL+revision or just URL, leads to
fragile fetching methods that depend on the phase of the Moon.

What Disarchive is currently doing for url-fetch is, roughly, indexing
by the integrity field, which depends only on the content itself
(sha256; usually not in the nix-base32 format referred to as ’base32’
in ’origin’ but in ’base16’ format, whatever).  In short, Disarchive-DB
does more or less two things: first, it maps this integrity hash to a
swhid hash, which allows looking up the SWH archive and fetching the
data; second, it stores metadata, indexed by the integrity field, which
allows reassembling the content = data + metadata.

We were discussing applying this strategy to all the fetching methods,
and potentially adding more content-address systems than just the swhid
hash, somehow.  All the robustness now relies on the availability of
the Disarchive service.

Given this context, what I find missing in the whole discussion is that
Git has a built-in solution (the commit hash), and the arguments for
not using it appear weak to me considering the easy advantage it
brings.

Knowing what information the ’uri’ field should contain for long-term
robustness is a difficult topic, with a lot of unknowns.  Although many
solutions are around, they imply a strong change of habits; changing my
own habits is already hard, so a collective change is a big collective
challenge. :-)

For instance, SWH promotes swhid instead of DOI for referencing
publications.  I am not sure it is really popular outside a small
French subgroup.
;-)

Somehow, finding some rationale (readability, matching versions, etc.)
and then finding countermeasures to their flaws in order to keep
extrinsic values (tag, revision, etc.) is, for what my opinion is
worth, not the correct level or frame when thinking about robustness
and the long term.

Cheers,
simon