Re: Concerns/questions around Software Heritage Archive
Hello Guixers, It’s been another week with no response or movement on this. I’m disappointed that this situation seems to be getting treated so lightly. Adhering to the terms of software licenses is fundamental to the operation of the free software ecosystem; there is no software freedom without it. It’s surprising that a pretty clear-cut situation of creating derivative works of free software in violation of their licenses would be shrugged off so easily. Whatever the Guix organization’s position is, I’m reaching my personal limit, and need to see some kind of positive movement on this[1]. If Guix is going to continue to facilitate license violations, I will have no choice but to remove my software from it to defend them. — Ian [1]: Personally, I would be satisfied with a per-package setting which disables scheduling source for archiving by SWH. Seeing this, or a commitment to build this within a reasonable timeframe, would allay my concerns. Ian Eure writes: Hello, I’m following up on this discussion since it’s been a month and I haven’t heard any updates. Summarizing the situation: - SHF has an opaque, difficult, and undocumented process for handling name changes. I’d like to stress again that this is *not* strictly a transgender issue (though it likely affects them more, or in worse/different ways) -- it is a human respect issue. Many, many more cisgender people change their name than transgender people. - SHF gave their archive to HuggingFace, an "AI" company which is generating derived works with no attribution or provenance, in ways which violate both the licenses of the projects used to train their model and the SHF principles for LLMs. - HuggingFace wasn’t respecting requests to opt out of their model. On the first point, it sounds like SHF has made concrete progress to improve[1], which is very good to hear. If SHF continues on this course, I think the concern is resolved. 
On the third point, HuggingFace has begun honoring opt-out requests, but is still very far behind. Also, they don’t remove code from the older versions of their model -- it remains there forever. This is progress, but still, not great. On the second point, I have not seen any public statements indicating that either SHF or HuggingFace even acknowledges the problem. SHF’s most recent newsletter[2], published in April 2024 (after these concerns came to light), continues to tout that StarCoder2 is "the first AI model aligned with our principles," which appears to be false. StarCoder2 includes both licensed and unlicensed code, and HuggingFace’s own StarChat2 playground produces works derivative of this code, with no attribution or licensing information. There is also no statement or position on the SHF news blog. Nor has HuggingFace either fixed their tools or made a statement. This is still very much a live concern. I have a few questions: - Has Guix reached out to SHF to express these concerns / get a response? - Whether a public or private response, what would Guix consider to be an acceptable response? An unacceptable response? - How long is Guix willing to wait for a response? Thanks, — Ian [1]: https://cohost.org/arborelia/post/5273879-they-are-fixing-some [2]: https://www.softwareheritage.org/wp-content/uploads/2024/04/Software-Heritage-2024-Vision-Milestones-Newsletter.pdf Ian Eure writes: Hi Guixy people, I’d never heard of SWH before I started hacking on Guix last fall, and it struck me as rather a good idea. However, I’ve seen some things lately which have soured me on them. 
They appear to be using the archive to build LLMs: https://www.softwareheritage.org/2024/02/28/responsible-ai-with-starcoder2/ I was also distressed to see how poorly they treated a developer who wished to update their name: https://cohost.org/arborelia/post/4968198-the-software-heritag https://cohost.org/arborelia/post/5052044-the-software-heritag GPL’d software I’ve created has been packaged for Guix, which I assume means it’s been included in SWH. While I’m dealing with their (IMO: unethical) opt-out process, I likely also need to stop new copies from being uploaded again in the future. Is there a way to indicate, in a Guix package, that it should *never* be included in SWH? Is there a way to tell Guix to never download source from SWH? I want absolutely nothing to do with them. Thanks, — Ian
Re: bug#67512: [PATCH v7 0/3] Add LibreWolf
Ian Eure writes: Clément Lassieur writes: On Fri, Apr 12 2024, Andrew Tropin via Guix-patches via wrote: On 2024-04-06 08:04, Ian Eure wrote: Moves nss update to nss-3.98 / nss-certs-3.98 to avoid rebuilding thousands of packages. Rebases. Ian Eure (3): gnu: Add nss-3.98. gnu: Add nss-certs-3.98. gnu: Add librewolf. gnu/packages/certs.scm | 16 + gnu/packages/librewolf.scm | 621 + gnu/packages/nss.scm | 45 +++ 3 files changed, 682 insertions(+) create mode 100644 gnu/packages/librewolf.scm base-commit: ade6845da6cec99f3bca46faac9b2bad6877817e Hi Ian, tested those patches, didn't notice any issues. Added pipewire to LD_LIBRARY_PATH to make screensharing on wayland to work. Added librewolf.scm to gnu/local.mk. Pushed as https://git.savannah.gnu.org/cgit/guix.git/commit/?id=3dc26b4eae Thank you very much for you work! Thank you Andrew for reviewing. Now that this is pushed, is there anyone maintaining this "librewolf" package? This is serious work, with security updates quite often. Hi Clement, I’m planning to continue sending patches for updates and the like. Getting a working updater is close to the top of my list. Right now the package is subject to CVE-2024-3852 (high) CVE-2024-3853 (high) CVE-2024-3854 (high) CVE-2024-3855 (high) CVE-2024-3856 (high) CVE-2024-3857 (high) CVE-2024-3858 (high) CVE-2024-3859 (moderate) CVE-2024-3860 (moderate) CVE-2024-3861 (moderate) CVE-2024-3862 (moderate) CVE-2024-3302 (low) CVE-2024-3864 (high) CVE-2024-3865 (high) The version in Guix is the latest available. I’ll send in a patch when the next release happens; I’m waiting on upstream for that. Okay, I see that I’m incorrect about this -- LibreWolf is moving onto Codeberg, but I was looking at their GitLab project, which doesn’t have the recent releases. I’ll get this updated. Thanks, — Ian
Re: bug#67512: [PATCH v7 0/3] Add LibreWolf
Clément Lassieur writes: On Fri, Apr 12 2024, Andrew Tropin via Guix-patches via wrote: On 2024-04-06 08:04, Ian Eure wrote: Moves nss update to nss-3.98 / nss-certs-3.98 to avoid rebuilding thousands of packages. Rebases. Ian Eure (3): gnu: Add nss-3.98. gnu: Add nss-certs-3.98. gnu: Add librewolf. gnu/packages/certs.scm | 16 + gnu/packages/librewolf.scm | 621 + gnu/packages/nss.scm | 45 +++ 3 files changed, 682 insertions(+) create mode 100644 gnu/packages/librewolf.scm base-commit: ade6845da6cec99f3bca46faac9b2bad6877817e Hi Ian, tested those patches, didn't notice any issues. Added pipewire to LD_LIBRARY_PATH to make screensharing on wayland to work. Added librewolf.scm to gnu/local.mk. Pushed as https://git.savannah.gnu.org/cgit/guix.git/commit/?id=3dc26b4eae Thank you very much for you work! Thank you Andrew for reviewing. Now that this is pushed, is there anyone maintaining this "librewolf" package? This is serious work, with security updates quite often. Hi Clement, I’m planning to continue sending patches for updates and the like. Getting a working updater is close to the top of my list. Right now the package is subject to CVE-2024-3852 (high) CVE-2024-3853 (high) CVE-2024-3854 (high) CVE-2024-3855 (high) CVE-2024-3856 (high) CVE-2024-3857 (high) CVE-2024-3858 (high) CVE-2024-3859 (moderate) CVE-2024-3860 (moderate) CVE-2024-3861 (moderate) CVE-2024-3862 (moderate) CVE-2024-3302 (low) CVE-2024-3864 (high) CVE-2024-3865 (high) The version in Guix is the latest available. I’ll send in a patch when the next release happens; I’m waiting on upstream for that. Thanks, — Ian
Re: Fallout from recent nss-certs changes
No, this is not a bug. specification->package always returns the latest version of a package and has no way of knowing what variable(s) that package object is bound to.

On April 21, 2024 8:02:50 AM PDT, Felix Lechner wrote:
>Hi,
>
>On Sat, Apr 20 2024, Ian Eure wrote:
>
>> If an operating-system’s packages includes `(specification->package
>> "nss-certs")', this causes breakage, because that form selects version
>> 3.98, but %base-packages includes 3.88.1, which causes an error on the
>> next `guix system reconfigure' due to conflicting package versions in
>> the profile.
>
>Why does the unversioned stringy selector (specification->package
>"nss-certs") resolve to a version different from the unversioned
>variable nss-certs? Is that a bug?
>
>Kind regards
>Felix
>
>P.S. I hoped to use the word "reified" but did not know how it fit in.

Thanks,
— Ian
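The difference between the variable and the string lookup can be checked from `guix repl'. This is only a sketch: it assumes nss-certs is still exported from (gnu packages certs), as it was at the time of this thread, and the version numbers in the comments are the ones reported in the thread, not verified output.

```scheme
;; Sketch for `guix repl'; module locations are as of this thread and
;; may differ in later Guix revisions.
(use-modules (gnu packages)          ; specification->package
             (gnu packages certs)    ; the nss-certs variable
             (guix packages))        ; package-version

;; The variable is bound to one specific package object
;; (reported in the thread as version 3.88.1).
(package-version nss-certs)

;; A bare name is resolved against every package with that name and
;; yields the highest version (reported in the thread as 3.98).
(package-version (specification->package "nss-certs"))

;; A spec may carry a version prefix to pin the lookup; whether this
;; particular prefix matches depends on the Guix revision in use.
(package-version (specification->package "nss-certs@3.88"))
```

The lookup works from the package name and version only, which is why it cannot tell which Scheme variable a given package object happens to be bound to.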
Re: Fallout from recent nss-certs changes
The change is mentioned in the channel news, but it says nothing about needing to remove that part of the config.

On April 21, 2024 1:32:38 AM PDT, "pelzflorian (Florian Pelz)" wrote:
>Hello Ian. My understanding of the nss-certs etc/news.scm item had been
>that we should remove (specification->package "nss-certs"), which became
>unnecessary and clutters config.scm. From what you write, this was
>actually not intended, but it is still not a bug IMHO.
>
>(I’m not involved with the change, though.)
>
>Regards,
>Florian

Thanks,
— Ian
Re: Concerns/questions around Software Heritage Archive
Hello, I’m following up on this discussion since it’s been a month and I haven’t heard any updates. Summarizing the situation: - SHF has an opaque, difficult, and undocumented process for handling name changes. I’d like to stress again that this is *not* strictly a transgender issue (though it likely affects them more, or in worse/different ways) -- it is a human respect issue. Many, many more cisgender people change their name than transgender people. - SHF gave their archive to HuggingFace, an "AI" company which is generating derived works with no attribution or provenance, in ways which violate both the licenses of the projects used to train their model and the SHF principles for LLMs. - HuggingFace wasn’t respecting requests to opt out of their model. On the first point, it sounds like SHF has made concrete progress to improve[1], which is very good to hear. If SHF continues on this course, I think the concern is resolved. On the third point, HuggingFace has begun honoring opt-out requests, but is still very far behind. Also, they don’t remove code from the older versions of their model -- it remains there forever. This is progress, but still, not great. On the second point, I have not seen any public statements indicating that either SHF or HuggingFace even acknowledges the problem. SHF’s most recent newsletter[2], published in April 2024 (after these concerns came to light), continues to tout that StarCoder2 is "the first AI model aligned with our principles," which appears to be false. StarCoder2 includes both licensed and unlicensed code, and HuggingFace’s own StarChat2 playground produces works derivative of this code, with no attribution or licensing information. There is also no statement or position on the SHF news blog. Nor has HuggingFace either fixed their tools or made a statement. This is still very much a live concern. I have a few questions: - Has Guix reached out to SHF to express these concerns / get a response? 
- Whether a public or private response, what would Guix consider to be an acceptable response? An unacceptable response? - How long is Guix willing to wait for a response? Thanks, — Ian [1]: https://cohost.org/arborelia/post/5273879-they-are-fixing-some [2]: https://www.softwareheritage.org/wp-content/uploads/2024/04/Software-Heritage-2024-Vision-Milestones-Newsletter.pdf Ian Eure writes: Hi Guixy people, I’d never heard of SWH before I started hacking on Guix last fall, and it struck me as rather a good idea. However, I’ve seen some things lately which have soured me on them. They appear to be using the archive to build LLMs: https://www.softwareheritage.org/2024/02/28/responsible-ai-with-starcoder2/ I was also distressed to see how poorly they treated a developer who wished to update their name: https://cohost.org/arborelia/post/4968198-the-software-heritag https://cohost.org/arborelia/post/5052044-the-software-heritag GPL’d software I’ve created has been packaged for Guix, which I assume means it’s been included in SWH. While I’m dealing with their (IMO: unethical) opt-out process, I likely also need to stop new copies from being uploaded again in the future. Is there a way to indicate, in a Guix package, that it should *never* be included in SWH? Is there a way to tell Guix to never download source from SWH? I want absolutely nothing to do with them. Thanks, — Ian
Fallout from recent nss-certs changes
Some recent nss-certs changes have a negative side effect which needs to be fixed.

A patch of mine was pushed recently (commit 0920693381d9f6b7923e69fe00be5de8621ddb6f), which adds nss-certs 3.98 to (gnu packages certs), under the nss-certs-3.98 variable. Then, commit fdfd7667c66cf9ce746330f39bcd366e124460e1 was pushed, which adds nss-certs to %base-packages-networking. This references the nss-certs variable, which is version 3.88.1.

If an operating-system’s packages include `(specification->package "nss-certs")', this causes breakage, because that form selects version 3.98, but %base-packages includes 3.88.1, which causes an error on the next `guix system reconfigure' due to conflicting package versions in the profile.

Prior to commit 65e8472a4b6fc6f66871ba0dad518b7d4c63595e, the graphical installer would ask users if they wanted to install nss-certs, and put this form into the operating-system’s packages, so there are likely many users affected -- it bit me, and I’ve seen a couple in IRC as well.

I think the options to fix this are:

1. Removing (specification->package "nss-certs") from one’s operating-system.
2. Grafting nss-certs 3.98 onto nss-certs 3.88.1.
3. Replacing nss-certs 3.88.1 with 3.98.

The most expedient option is 1, as it can be applied by users -- but there’s probably not a good way to communicate that this needs to happen. There was some talk in IRC about grafting nss/nss-certs, but it looks like this didn’t happen. An upgrade is the best path, but would probably need to happen in core-updates, since this rebuilds a large number of packages.

Thoughts on this?

Thanks,
— Ian
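For readers hitting this, option 1 might look roughly like the config.scm fragment below. It is a sketch, not a complete system declaration: the other operating-system fields are omitted, and the "before" form is an assumption about what the installer wrote, based on the description in this message.

```scheme
;; config.scm fragment: only the `packages' field is shown; the other
;; operating-system fields (host-name, bootloader, file-systems, and
;; so on) stay as they are in your existing configuration.
(use-modules (gnu))

(operating-system
  ;; Before (assumed shape of what the graphical installer generated):
  ;;
  ;;   (packages (append (list (specification->package "nss-certs"))
  ;;                     %base-packages))
  ;;
  ;; After: drop the explicit entry and rely on the nss-certs that
  ;; %base-packages now brings in on its own.
  (packages %base-packages))
```

If the list contains other packages besides nss-certs, only the `(specification->package "nss-certs")' element needs to be removed; the rest of the `append' form can stay.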
Re: Concerns/questions around Software Heritage Archive
Simon Tournier writes: Hi, On lun., 18 mars 2024 at 12:38, Ian Eure wrote: They appear to be violating free software licenses on a large scale. They are in violation of SWH’s own positions. [...] [1]: https://arxiv.org/html/2402.19173v1 [2]: https://huggingface.co/spaces/HuggingFaceH4/starchat2-playground [3]: https://huggingface.co/datasets/bigcode/the-stack-v2 [4]: https://github.com/bigcode-project/opt-out-v2/issues Please note that Software Heritage folks are not co-author of all that; or I misread. Do not take me wrong, this is not an attempt to escape but a query for waiting the feedback of SWH. Shit rolls downhill. It’s the least surprising thing in the world to find that an "AI" company is violating licenses, because the entire technology is based on infringement at a massive scale. SWH’s partnership with, and promotion of, both the company and its license-violating model, in violation of their *own stated principles*, raises very legitimate questions. There are multiple overlapping concerns here: personal, organizational, legal, ethical, and technical. From a personal, legal standpoint, HuggingFace is almost certainly in violation of my code’s licenses. I will, therefore, work to remove my code from their models. From a personal, ethical standpoint, I believe that SWH has proven themselves untrustworthy by enabling *and promoting* this infringement in violation of their own stated policies, and will work to remove my code from their archive. Personally, I cannot extend them the benefit of the doubt on this. They blew it. From an organizational, ethical standpoint, Guix is IMO on the right track by waiting on SWH (and perhaps pressuring them to fix things). From an organizational, technical perspective, I would like to see concrete measures to support my (and hundreds of others’) personal, ethical desires to exclude software from SWH, and by extension, HuggingFace’s models. As Ludo said, SWH folks are, by the way, also long time Free Software activists. 
In my view, this is not to their credit. I’d expect people familiar with Free Software to be *more* sensitive to licensing concerns, thus less likely to partner with a company likely to violate them. PS: Thanks for the detailed explanations. I will provide my reading later, after some concerns will be separated, eventually. You’re very welcome. Thanks, — Ian
Re: Concerns/questions around Software Heritage Archive
Simon Tournier writes: Hi, On sam., 16 mars 2024 at 08:52, Ian Eure wrote: They appear to be using the archive to build LLMs: https://www.softwareheritage.org/2024/02/28/responsible-ai-with-starcoder2/ About LLM, Software Heritage made a clear statement: https://www.softwareheritage.org/2023/10/19/swh-statement-on-llm-for-code Quoting: We feel that the question is no longer whether LLMs for code should be built. They are already being built, independently of what we do, and there is no turning back. The real question is how they should be built and whom they should benefit. Principles: 1. Knowledge derived from the Software Heritage archive must be given back to humanity, rather than monopolized for private gain. The resulting machine learning models must be made available under a suitable open license, together with the documentation and toolings needed to use them. 2. The initial training data extracted from the Software Heritage archive must be fully and precisely identified by, for example, publishing the corresponding SWHID identifiers (note that, in the context of Software Heritage, public availability of the initial training data is a given: anyone can obtain it from the archive). This will enable use cases such as: studying biases (fairness), verifying if a code of interest was present in the training data (transparency), and providing appropriate attribution when generated code bears resemblance to training data (credit), among others. 3. Mechanisms should be established, where possible, for authors to exclude their archived code from the training inputs before model training begins. I hope it clarifies your concerns to some extent. It doesn’t clarify them, but it does illustrate them. HuggingFace and the StarCoder2 model is in violation of principle 2. By their own admission, they are including code without clear licensing[1]: The main difference between the Stack v2 and the Stack v1 is that we include both permissively licensed and unlicensed files. 
HuggingFace’s StarChat2 Playground[2] also violates this principle, as it outputs code without any license or provenance information; I know, because I tried it. While their own terms of use for StarCoder2 state: Any use of all or part of the code gathered in The Stack v2 must abide by the terms of the original licenses... ...their own playground makes this impossible. HuggingFace is also in violation of the third principle, because they haven’t established a functioning opt-out model[3]. Opting out requires using non-free software; requests have been sitting for nearly a year with no action or response; and out of every request submitted, only a single one has *ever* been honored. They appear to be violating free software licenses on large scale. They are in violation of SWH’s own positions. Moreover, you wrote: « I want absolutely nothing to do with them. » Maybe there is a misunderstanding on your side about what “free software” and GPL means because once “free software”, you cannot prevent people to use “your” free software for any purposes you dislike. If you want to bound the use cases of the software you create, you need to explicitly specify that in the license. And if you do, your software will not be considered as “free software”. That’s the double sword of “free software”. :-) I am crystal clear on the meaning of free software. I wish to remove it from these models *in order to* keep it free. Thanks, — Ian [1]: https://arxiv.org/html/2402.19173v1 [2]: https://huggingface.co/spaces/HuggingFaceH4/starchat2-playground [3]: https://huggingface.co/datasets/bigcode/the-stack-v2 [4]: https://github.com/bigcode-project/opt-out-v2/issues
Re: Concerns/questions around Software Heritage Archive
MSavoritias writes: On 3/17/24 13:53, paul wrote: Hi all , thank you MSavoritias for bringing up points that many of us share. It's clearly a tradeoff what to do about the past. For the future, as Christpher already stated, we need a serious solution that we can uphold as a free software project that does not alienate users or contributors. My opinion is that names are just wrong to be included, not only because of deadnames, but in general having a database with a column first_name and a column second_name is something only a 35 yrs old white cis boy could have thought was a good idea to model the spectrum of names humans use all over the world: https://web.archive.org/web/20240317114846/https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/ If we'd really need to identify contributors, and obviously Guix doesn't, we could use an UUID/machine readable identifier which can then be mapped to a displayed name. I believe git can already be configured to do so. giacomo The uuid sounds like a very interesting solution indeed. I wonder how easy it could be to add it to git. This also seems like interesting territory to explore. The concerns raised around rewriting history have valid points; I think it’s impractical to rewrite history any time a change needs to happen, as that would be an ongoing source of disruption. But rewriting history *once*, to switch to a more general mechanism, seems like a reasonable trade to me. This also presents an opportunity: we could combine this with a default branch switch from master to main. A news entry left as the final commit in master could inform people of whatever steps may be needed to update (if that can’t be automated), and the main branch would contain the rewritten history. It’s certainly not a perfect solution, but it seems pragmatic. — Ian
Re: Concerns/questions around Software Heritage Archive
MSavoritias writes: On 3/17/24 11:39, Lars-Dominik Braun wrote: Hey, I have heard folks in the Guix maintenance sphere claim that we never rewrite git history in Guix, as a matter of policy. I believe we should revisit that policy (is it actually written anywhere?) with an eye towards possible exceptions, and develop a mechanism for securely maintaining continuity of Guix installations after history has been rewritten so that we maintain this as a technical possibility in the future, even if we should choose to use it sparingly. the fallout of rewriting Guix’ git history would be devastating. It would break every single Guix installation, because a) `guix pull` authenticates commits and we might lose our trust anchor if we rewrite history earlier than the introduction of this feature, b) `guix pull` outright rejects changes to the commit history to prevent downgrade attacks. Additionally it would break every single existing usage of the time machine and thereby completely defeat the goal of providing reproducible software environments since the commit hash is used to identify the point in time to jump to. I doubt developing “mechanisms” – whatever they look like – would be worth the effort. Our contributors matter, but so do our users. Never ever rewriting our git history is a tradeoff we should make for our users. Lars Thats a good point. in the sense that its a tradeoff here and I absolutely agree. But let me add some food for thought here: 1. Were the social aspects considered when the system came into place? 2. Is it more important for the system to stay as is than to welcome new contributors? 3. You mention "its a tradeoff we should make for our users". How many trans people where involved in that decision and how much did their opinion matter in this? I am saying this because giving power to people(what is called users) is not only handling them code or make sure everything is free software. 
It’s also the hard part of making sure the voices of people who cannot code are heard, and that they are participating and taken into account. Just want to say that I appreciate and agree with your thoughtful words. I’d also note that name changes aren’t a concern limited to trans people, and framing this as "we have to upend everything Because Transgender" is both wrong and feels pretty bad to me. Anyone can change their name at any time for any reason, or no reason at all, and may wish to update historical references to their previous names. Having a mechanism to support this is, in my view, a matter of basic decency and respect for all humans. Thanks, — Ian
Re: Concerns/questions around Software Heritage Archive
Christopher Baines writes: [[PGP Signed Part:Undecided]] Ian Eure writes: Hi Guixy people, I’d never heard of SWH before I started hacking on Guix last fall, and it struck me as rather a good idea. However, I’ve seen some things lately which have soured me on them. They appear to be using the archive to build LLMs: https://www.softwareheritage.org/2024/02/28/responsible-ai-with-starcoder2/ I was also distressed to see how poorly they treated a developer who wished to update their name: https://cohost.org/arborelia/post/4968198-the-software-heritag https://cohost.org/arborelia/post/5052044-the-software-heritag GPL’d software I’ve created has been packaged for Guix, which I assume means it’s been included in SWH. While I’m dealing with their (IMO: unethical) opt-out process, I likely also need to stop new copies from being uploaded again in the future. Is there a way to indicate, in a Guix package, that it should *never* be included in SWH? Not currently, and I don't really see the point in such a mechanism. If you really never want them to store your code, then you need to license it accordingly (and not make it free software). I don’t want my code in SWH *because* it’s free. A primary use of LLMs is laundering freely licensed software into proprietary, commercial projects through "AI" code completion and generation. Any Free software in an LLM training set can and will be used in violation of its license, without a clear path for the author to seek recourse. I deleted my code off Github and abandoned it completely for this exact reason, and am deeply irked to be going through this nonsense again. A more salient question may be: Is there a process within Guix (either the program or the organization) which uploads source to SWH? Or does it rely on SWH independently? If the latter, my problem is likely solved by blocking SWH at my network edge and opting out of their archive (or trying to) and the downstream training models they’ve already put it in. 
If the former, the only control I currently have to protect my license is removing packages from Guix which contain it. I don’t want that outcome. Noting also that the path here seems to be SWH->huggingface->bigcode training set, and the opt-out process for the training set appears to be a complete sham. To opt-out, you must create a Github Issue; only one opt-out has *ever* been processed, and there are 200+ sitting there, many with no response for nearly a year[1]. I want no part of any of this. Is there a way to tell Guix to never download source from SWH? Also no, and it's probably best to do this at the network level on your systems/network if you want this to be the case. I’ll investigate this, though I’d prefer if there was a way to configure source mirrors in the Guix daemon. Skipping back to this though: I was also distressed to see how poorly they treated a developer who wished to update their name: https://cohost.org/arborelia/post/4968198-the-software-heritag https://cohost.org/arborelia/post/5052044-the-software-heritag This is probably worth thinking about as Guix is in a similar situation regarding publishing source code, and people potentially wanting to change historical source code both in things Guix packages and Guix itself. Like Software Heritage, there's cryptographical implications for rewriting the Git history and modifying source tarballs or nars that contain source code. We have 17TiB of compressed source code and built software stored for bordeaux.guix.gnu.org now and we should probably work out how to handle people asking for things to be removed or changed (for any and all reasons). It's probably worth working out our position on this in advance of someone asking. Yes, I agree that Guix needs a better solution for this. Thanks, — Ian [1]: https://github.com/bigcode-project/opt-out-v2/issues
Concerns/questions around Software Heritage Archive
Hi Guixy people, I’d never heard of SWH before I started hacking on Guix last fall, and it struck me as rather a good idea. However, I’ve seen some things lately which have soured me on them. They appear to be using the archive to build LLMs: https://www.softwareheritage.org/2024/02/28/responsible-ai-with-starcoder2/ I was also distressed to see how poorly they treated a developer who wished to update their name: https://cohost.org/arborelia/post/4968198-the-software-heritag https://cohost.org/arborelia/post/5052044-the-software-heritag GPL’d software I’ve created has been packaged for Guix, which I assume means it’s been included in SWH. While I’m dealing with their (IMO: unethical) opt-out process, I likely also need to stop new copies from being uploaded again in the future. Is there a way to indicate, in a Guix package, that it should *never* be included in SWH? Is there a way to tell Guix to never download source from SWH? I want absolutely nothing to do with them. Thanks, — Ian
Re: Proposal to turn off AOT in clojure-build-system
Hello, I’ve been following along with this discussion, as well as a discussion on Clojureverse, and thought it might be helpful to pull together some threads and design decisions around Clojure’s behavior. Clojure is designed to ship libraries as source artifacts, not bytecode ("pretty much all other Clojure libraries ... are all source code by design[1]."; "Clojure is ... a source-first language[2]"), and the view of the community is that shipping AOT artifacts "is an anti-pattern[1]." Clojure library JARs are more akin to source tarballs than binaries. The original design and intent of Clojure’s AOT compiler is to compile "just a few things... for the interop case" or "Everything... For the 'Application delivery', 'Syntax check', and 'reflection warnings' cases[3]." Clojure’s compiler is transitive and "does not support separate compilation"[3], meaning when a namespace is compiled, anything it uses is compiled and emitted with it. This is the crux of why mixing AOT and non-AOT code is troublesome: it causes dependency diamonds, where the AOT’d library contains a duplicate, older version of code used elsewhere in the project. The Clojure reference on compiling[4] gives some reasons you might want to AOT: "To deliver your application without source," "To speed up application startup," "To generate named classes for use by Java," "To create an application that does not need runtime bytecode generation and custom classloaders." Note that there’s no mention of compiling libraries for any reason; only applications. When AOT is used "for the interop case," it’s typical to AOT only those namespaces[5], not the entire library. Shipping AOT-compiled Clojure libraries has caused real and very weird and hard-to-debug problems in the past: https://clojure.atlassian.net/browse/CLJ-1886?focusedCommentId=15290 https://github.com/clj-commons/byte-streams/issues/68 and https://clojure.atlassian.net/browse/CLJ-1741 Clojure doesn’t have guarantees around ABI stability[6][7]. 
To date, most ABI changes have been additive, but there are no guarantees that the ABI will be compatible from any one version of Clojure to any other. The understanding of the Clojure community is that the design of the current compiler can’t offer a stable ABI[8] at all. Because nobody in the Clojure community AOTs intermediate (that is, library) code, this hasn’t been a problem and is unlikely to change. "Clojure tries very hard to provide source compatibility but not bytecode compatibility across versions[9]." Correctly handling the ABI concerns (which Guix currently does not do) would result in a combinatorial explosion of Clojure packages should multiple versions of Clojure ever be available in Guix at the same time. For example, if someone wanted to package Clojure 1.12.0-alpha9, you’d need to duplicate every package taking Clojure as an input so they use the correct version. While ABI breakage has been rare thus far, it seems likely that it’ll occur at some point; perhaps if Clojure reaches version 2.0.0. If Guix disables AOT for Clojure libraries, we have source compatibility, and the AOT/ABI problems are moot. Clojure’s compiler is non-deterministic[10]: the same compiler can produce different bytecode for the same input across multiple runs. I’m not sure if this is a problem for Guix at this point in time, but it seems out of line with Guix expectations for compilation generally. Opinions follow: If we’re taking votes, mine is to *not* AOT Clojure libraries, both for the technical reasons laid out above, and also for the social reason of not violating the principle of least surprise. I understand that Guix and Clojure have very different approaches, and some balance must be struck. However, the lack of ABI guarantees, the compiler’s behavior, the promise of source compatibility, and matching the expectation of the audience these tools are meant for all convince me that disabling AOT is the right course here. 
AOT’ing Clojure applications (which means, more or less, "the Clojure tooling") is desirable, and should be maintained. — Ian [1]: https://clojureverse.org/t/should-linux-distributions-ship-clojure-byte-compiled-aot-or-not/10595/8 [2]: https://clojureverse.org/t/should-linux-distributions-ship-clojure-byte-compiled-aot-or-not/10595/30 [3]: https://clojure.org/reference/compilation [4]: https://archive.clojure.org/design-wiki/display/design/Transitive%2BAOT%2BCompilation.html [5]: https://clojure.org/guides/deps_and_cli#aot_compilation [6]: https://clojureverse.org/t/should-linux-distributions-ship-clojure-byte-compiled-aot-or-not/10595/30 [7]: https://gist.github.com/hiredman/c5710ad9247c6da12a99ff6c26dd442e [8]: https://clojureverse.org/t/should-linux-distributions-ship-clojure-byte-compiled-aot-or-not/10595/4 [9]: https://clojureverse.org/t/should-linux-distributions-ship-clojure-byte-compiled-aot-or-not/10595/18
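To illustrate the transitive-compilation point made in this message, here is a minimal Python model of how an AOT-compiled library JAR can end up carrying an older copy of code that the application also pulls in as source. It is a sketch under stated assumptions, not a description of the Clojure compiler's internals: the namespace names (`lib-a.core`, `util.core`), the version numbers, and the helper functions are all hypothetical.

```python
def transitive_closure(root, requires):
    """Return every namespace reachable from `root` through `requires`."""
    seen = set()
    stack = [root]
    while stack:
        ns = stack.pop()
        if ns in seen:
            continue
        seen.add(ns)
        stack.extend(requires.get(ns, []))
    return seen


def aot_jar(root, requires, versions):
    """Model an AOT-compiled JAR as {namespace: version compiled into it}.

    Because compilation is transitive in this model, the JAR contains
    classes for the root namespace *and* for everything it requires.
    """
    return {ns: versions[ns] for ns in transitive_closure(root, requires)}


# Hypothetical dependency graph: lib-a uses util.
requires = {"lib-a.core": ["util.core"], "util.core": []}

# lib-a was AOT-compiled when util was at 1.0, so util 1.0 is baked in.
lib_a_jar = aot_jar("lib-a.core", requires,
                    {"lib-a.core": "0.3", "util.core": "1.0"})

# The application wants util 2.0, shipped as source.
app_sources = {"util.core": "2.0"}

# Namespaces present in both places at different versions form the diamond.
conflicts = {
    ns: (lib_a_jar[ns], version)
    for ns, version in app_sources.items()
    if ns in lib_a_jar and lib_a_jar[ns] != version
}
print(conflicts)
```

In this model the conflict disappears as soon as `lib-a` is shipped as source, since nothing from `util.core` is baked into it.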
Re: Guix System automated installation
Hi Giovanni, Giovanni Biscuolo writes: Hello Ian, I'm a little late to this discussion, sorry. I'm adding guix-devel since it would be nice if some Guix developer has something to add on this matter; for this reason I'm leaving all previous messages intact. Csepp writes: Ian Eure writes: Hello, On Debian, you can create a preseed file containing answers to all the questions you’re prompted for during installation, and build a new install image which includes it. When booted, this installer skips any steps which have been preconfigured, which allows for either fully automated installation, or partly automated (prompt for hostname and root password, but otherwise automatic). Does Guix have a way to do something like this? The declarative config is more or less the equivalent of the Debian preseed file, but I don’t see anything that lets you build an image that’ll install a configuration. When using the guided installation (info "(guix) Guided Graphical Installation"), right before the actual installation on target (guix system init...) you can edit the operating-system configuration file: isn't it something similar to what you are looking for? Please consider that a preseed file is very limited compared to a full-fledged operating-system declaration since the latter contains the declaration for *all* OS configuration, not just the installed packages. I appreciate where you’re coming from, I also like the one-file system configuration, but this is inaccurate. Guix’s operating-system doesn’t encompass the full scope of configuration necessary to install and run an OS; Debian’s preseed has significantly more functionality than just specifying the installed packages. Right now, Debian’s system allows you to do things which Guix does not. Preseed files contain values that get set in debconf, Debian’s system-wide configuration mechanism, so they can both configure the resulting system as well as the install process itself. 
This means you can use a preseed file to tell the installer to partition disks, set up LUKS-encrypted volumes (and specify one or more passwords for them), format those with filesystems, install the set of packages you want, and configure them -- though debconf’s package configuration is more limited, generally, than Guix provides[1]. With Debian, I can create a custom installer image with a preseed file, boot it, and without touching a single other thing, it’ll install and configure the target machine, and reboot into it. That boot-and-it-just-works experience is what I want from Guix. For things that can’t be declared in operating-system, like disk partitioning and filesystem layout, the installer performs those tasks imperatively, then generates a system config with those device files and/or UUIDs populated, then initializes the system. There’s no facility for specifying disk partitioning or *creating* filesystems in the system config -- it can only be pointed at ones which have been created already. guix system image is maybe closer, but it doesn’t automate everything that the installer does. But the installer can be used as a Scheme library, at least in theory. The way I would approach the problem is by creating a Shepherd service that runs at boot from the live booted ISO. I would really Love So Much™ to avoid writing imperative bash scripts and just write Scheme code to be able to do a "full automatic" Guix System install, using a workflow like this one: 1. guix system prepare --include preseed.scm disk-layout.scm /mnt where disk-layout.scm is a declarative gexp used to partition, format and mount all needed filesystems the resulting config.scm would be an operating-system declaration with included the contents of preseed.scm (packages and services declarations) 2. guix system init config.scm /mnt (already working now) ...unfortunately I'm (still?!?) 
not able to contribute such code :-( I don’t think there’s any need for a preseed.scm file, and I’m not sure what would be in that, but I think this is close to the right track. Either operating-system should be extended to support things like disk partitioning, and effect those changes at reconfigure time (with suitable safeguards to avoid wrecking existing installs), or the operating-system config could get embedded in another struct which contains that, similar to the (image ...) config for `guix system image'. I think there are some interesting possibilities here: you could change your partition layout and have Guix resize them / create new ones for you. — Ian [1]: A workaround for this is to create packages which configure the system how you want, then include them on the installer image / list them in the packages to be installed. Not ideal, but you can.
Re: QA is back, who wants to review patches?
Christopher Baines writes: Hey! After substitute availability taking a bit of a dive recently, the bordeaux build farm has finally caught back up and QA is back submitting builds for packages changed by patches. QA also has a feature to allow easily tagging patches (issues) as having been reviewed and ready to merge (reviewed-looks-good). You can do this by sending an email and QA has a form ("Mark patches as reviewed") on the page for each issue to help you do this. I'd encourage anyone and everyone to review patches, there's no burden on you to spot every problem and you don't need any special knowledge. You just need to not be involved (so you can't review your own patches) and take a good look at the changes, mentioning any questions that you have or problems that you spot. If you think the changes look good to be merged, you can tag the issue accordingly. When issues are tagged as reviewed-looks-good, QA will display them in dark green at the top of the list of patches, so it's on those with commit access to prioritise looking at these issues and merging the patches if indeed they are ready. Let me know if you have any comments or questions! Wanted to check things out, but it’s giving the same error message on every page: An error occurred Sorry about that! misc-error #fvector->list: expected vector, got ~S#f#f Also, the certificate for issues.guix.gnu.org expired today. Is there a plan to improve the reliability of Guix infrastructure? It seems like major things break with alarming regularity. — Ian
Re: Guix CLI, thoughts and suggestions
Hi Carlo, Thank you for the thoughtful reply. Carlo Zancanaro writes: Hi Ian, Much of what you've written is fair, and I'm sure that Guix's commands could be better organised. I'm not really involved in Guix development, but I think there are two "inconsistencies" that you've mentioned which can be explained. On Mon, Jan 15 2024, Ian Eure wrote: Some examples of where I think Guix could do better. This is an illustrative list, not an exhaustive one. Inconsistent organization = Most package-related commands are under `guix package', but many are sibling commands. Examples are `guix size', `guix lint', `guix hash', etc. I think the real inconsistency here is that `guix package' is poorly named. This command really operates on profiles, and performs operations (install, remove, list, etc.) on those profiles. Packages are given as arguments to this command. The other commands operate on, and show the properties of, packages. Similarly with `guix build'. Yes, I agree the behavior makes a bit more sense from that viewpoint. However, it does have non-profile-related things in it, such as `--show' and `--search'. This is getting into another thing I’ve seen a bit of, which is overloaded commands -- ones that do multiple things that are unrelated or tangentially related. But, I didn’t have a good example, and my message was long enough already. Inconsistency between verbs and options === ... For example, installing a package is `guix package -i foo' rather than `guix package install foo', removing is `guix package -r foo' rather than `guix package remove foo', and listing installed packages is `guix package -I' rather than `guix package installed' (or similar). The specific example of `guix package' might be explained by considering it as a single transaction to update the profile. The command `guix package' really says "perform a transaction on the profile", and the options are the commands in the transaction. 
Since there can be multiple commands, and the command names look like package names, they are provided as options. This doesn't fully explain the behaviour. In particular the example you give: This means that users can express commands which *seem* like they should work, but do not. For example `guix package -i emacs -r emacs-pgtk -I' represents a command to 1) install emacs 2) remove emacs-pgtk 3) list installed packages (which would verify the previous two operations occurred). ... seems reasonable to have working within the view of `guix package' as a transactional operation. I agree that this would make sense, but my understanding is that `guix package' doesn’t work like that -- it only performs the final operation in the list. IMO, it should either do *everything* the commands specify, or print an error and take no action. It's also worth noting that there are convenience shortcuts in `guix install' and `guix remove'. It seems like a lot of work to change, and backwards compatibility also is an issue. I see backwards compatibility as the main issue here. There was a lot of discussion preceding the inclusion of `guix shell', because of the prospect of breaking existing tutorials/documentation floating around on the internet. This is an even bigger concern for a more drastic reorganisation of the CLI. I agree, I don’t think the situation can be improved without finding a solution to preserve BC. But, I didn’t think it was worth making detailed plans for any of this before gauging whether the problem was one broadly considered to be worth solving. — Ian
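The "do everything, or print an error and take no action" rule discussed in this message can be sketched in a few lines of Python. This is only an illustration of the validation idea, not how `guix package' is implemented: the option tables below are a simplified, hypothetical subset (for instance, `-I' is treated as taking no argument), and `plan' and `UsageError' are names invented for the example.

```python
# Simplified, hypothetical subset of options, for illustration only.
TRANSACTION_OPTIONS = {"-i": "install", "-r": "remove"}
QUERY_OPTIONS = {"-I": "list-installed"}


class UsageError(Exception):
    """Raised when the command line cannot be honoured in full."""


def plan(argv):
    """Turn an argument list into ordered steps, or refuse the whole thing.

    Nothing is executed here: the caller only acts on the returned steps,
    so a rejected command line leaves the profile untouched.
    """
    steps = []
    i = 0
    while i < len(argv):
        option = argv[i]
        if option in TRANSACTION_OPTIONS:
            if i + 1 >= len(argv) or argv[i + 1].startswith("-"):
                raise UsageError(f"{option} requires a package name")
            steps.append((TRANSACTION_OPTIONS[option], argv[i + 1]))
            i += 2
        elif option in QUERY_OPTIONS:
            steps.append((QUERY_OPTIONS[option], None))
            i += 1
        else:
            raise UsageError(f"unknown option: {option}")

    # Reject command lines that mix queries with profile changes instead
    # of silently dropping part of what the user asked for.
    kinds = {"query" if package is None else "transaction"
             for _, package in steps}
    if len(kinds) > 1:
        raise UsageError("cannot mix queries with install/remove")
    return steps
```

With this sketch, `plan(["-i", "emacs", "-r", "emacs-pgtk"])` returns both steps in order, while appending `"-I"` raises `UsageError` before anything is done. The alternative policy, running all three steps in sequence, would only need the final check removed.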
Guix CLI, thoughts and suggestions
Greetings, As I’ve been learning Guix, one of the things I’ve found somewhat unpleasant is the lack of consistency within the guix CLI tool. It feels a bit Git-like, with not much consistency, commands that non-obviously perform more than one operation, related commands in different places in the tree, etc. Just so you know where I’m coming from: I’ve found that complex CLI tooling benefits from organization and consistency. The Linux ip(8) command is a good example of this kind of organization: to add an IP address, you use `ip address add'. To show addresses, `ip address show', and to remove one `ip address del'. When options are needed, they get added after the verb or branch in the verb tree; the final verb may take positional arguments as well as --long or -s (short)-form options. Some examples of where I think Guix could do better. This is an illustrative list, not an exhaustive one. Inconsistent organization = Most package-related commands are under `guix package', but many are sibling commands. Examples are `guix size', `guix lint', `guix hash', etc. Inconsistency between verbs and options === Some verbs are bare-word positional arguments, and others are flags to related verbs. IMO, this is the biggest problem, and makes it very difficult to find all the things the CLI can do. `guix package' is a major offender in this area, as it mixes verbs and verb-specific options into the same level. For example, installing a package is `guix package -i foo' rather than `guix package install foo', removing is `guix package -r foo' rather than `guix package remove foo', and listing installed packages is `guix package -I' rather than `guix package installed' (or similar). This means that users can express commands which *seem* like they should work, but do not. For example `guix package -i emacs -r emacs-pgtk -I' represents a command to 1) install emacs 2) remove emacs-pgtk 3) list installed packages (which would verify the previous two operations occurred). 
This is a valid command within the accepted organization of `guix package', and doesn’t cause an error, but doesn’t work: the install and remove steps are ignored. A thing I’ve found throughout my career is that designing systems so it’s *impossible* to represent unsupported, nonsensical, or undefined things is an extremely valuable technique to avoid errors and pitfalls. I think Guix could get a lot of mileage out of adopting something similar. This causes a related problem of making it impossible to know what options are valid for what verbs. Will `guix package --cores=8 -r emacs' remove the package while using eight cores of my system? Will `guix system -s i686 switch-generation 5' switch me to a 32-bit version of generation 5? If verbs are organized better, and have their own options, this ambiguity vanishes. More inconsistency == Other parts of guix have the opposite problem: `guix system docker-image' probably ought to be an option to `guix system image' rather than a separate verb. Inconsistency between similar commands == There are generations of both the system (for GuixSD) and the user profile; however, they work differently. For the system, there’s `guix system list-generations' and `guix system switch-generation', but for the user profile, you need `guix package --list-generations' and `guix package --switch-generation=PATTERN'. Additionally, no help is available for either of the system commands: `guix system switch-generation --help' gives the same output as `guix system --help' -- no description of the supported ways of expressing a generation is available. Flattened verbs === Related, the generation-related commands under `guix system' ought to be one level deeper: `guix system generation list', `guix system generation switch' etc. Repeated options Many commands (`guix package', `guix system', `guix build', `guix shell') take -L options, to add Guile source to their load-path. 
This probably ought to be an option to guix itself, so you can do `guix -L~/src/my-channel build ...'. Suggestions === All commands should be organized into a tree of verbs. Verbs should have common aliases (`rm' for `remove', etc). Verbs should be selected by specifying the minimum unambiguous substring. For example `guix sys gen sw' could refer to `guix system generation switch'. Options should be applicable to each level of the tree, ex `guix -L~/src/my-channel' would add that load-path, which would be visible to any command. Requesting help is a verb. Appending "help" to any level of the verb tree should show both options applicable to that verb, and its child verbs. `guix help' would show global options and all top-level verbs (package, system, generation, etc); `guix package help' would show
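The "minimum unambiguous substring" and alias suggestions above can be sketched as a small resolver over a verb tree. This is a Python illustration under stated assumptions, not a proposal for the actual implementation: the tree contents, the alias table, and the names `resolve', `UnknownVerb' and `AmbiguousVerb' are all hypothetical and only mirror the examples given in this message.

```python
# Hypothetical verb tree mirroring the suggestions in this message;
# it does not describe Guix's current CLI.
VERB_TREE = {
    "system": {
        "generation": {"list": {}, "switch": {}},
        "image": {},
    },
    "package": {"install": {}, "remove": {}, "installed": {}},
    "help": {},
}

# Common aliases are expanded before any prefix matching happens.
ALIASES = {"rm": "remove"}


class UnknownVerb(Exception):
    """No verb at this level starts with the given word."""


class AmbiguousVerb(Exception):
    """More than one verb at this level starts with the given word."""


def resolve(words, tree=VERB_TREE):
    """Expand abbreviated words into the full path through the verb tree."""
    path = []
    node = tree
    for word in words:
        word = ALIASES.get(word, word)
        if word in node:
            # An exact match always wins, even if it prefixes another verb.
            matches = [word]
        else:
            matches = sorted(verb for verb in node if verb.startswith(word))
        if not matches:
            raise UnknownVerb(f"no verb matches '{word}'")
        if len(matches) > 1:
            raise AmbiguousVerb(f"'{word}' could be: {', '.join(matches)}")
        path.append(matches[0])
        node = node[matches[0]]
    return path
```

With this tree, `resolve(["sys", "gen", "sw"])` yields `["system", "generation", "switch"]` and `resolve(["package", "rm"])` yields `["package", "remove"]`, while `resolve(["package", "inst"])` raises `AmbiguousVerb` because both `install` and `installed` match. Giving exact matches priority is one possible design choice; it keeps `install` usable even though it is a prefix of `installed`.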
Anyone working on more recent glib/gtk4 packages?
Hello, I wanted to package Fractal, which is a native GNOME client for Matrix chat. It requires newer versions of glib and gtk than are currently in Guix. I believe I’ve seen in IRC that some folks are working on getting GNOME 43/44 packages done, which probably needs the glib/gtk updates to happen. If there’s work in this direction, could someone point me to it? Thanks, — Ian