On 4/8/26 12:20, Ludovic Courtès wrote:
> Hi,
>
> Hugo Buddelmeijer via <[email protected]> skribis:
>
>> On 4/7/26 01:29, Dr. Arne Babenhauserheide wrote:
>>
>> [...]
>>
>>> Are they more reliable than
>>>       ./pre-inst-env guix refresh -u PACKAGE
>>
>> LLMs are much more reliable than guix refresh,
>
> ‘guix refresh’ checks upstream for the newer versions and updates
> versions and hashes; LLMs can do many things (including proposing
> changes based on what they learned from the Guix repo) but that’s
> something they cannot do, by definition.
>
> Then again, an agent, rather than an LLM, might be able to emulate what
> ‘guix refresh’ is doing through trial and error, but since we already
> have the tool to do it right, I don’t understand what this would buy us.

I was not clear.

Firstly, I used the terms LLM and agent interchangeably, even though I prefer 'agent', because the fact that agents use LLMs is not the point. (That is, the term 'agent' would still apply if the LLM part were swapped for something else.)

Secondly, the idea was to use agents in addition to guix refresh, not as a substitute. E.g., in my experiment, I told my agent to try to refresh broken packages before trying to patch them. (It was quite interesting to see it figure out how to use the various guix commands; it even RTFM'ed.)
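
The refresh-first workflow described above could be sketched roughly as follows. This is only an illustration, not the setup used in the experiment: the function name `refresh_then_build` and the `GUIX` override variable are hypothetical, and the only Guix commands assumed are `guix refresh -u` and `guix build`.

```shell
#!/bin/sh
# GUIX is a hypothetical override so the sketch can be tried without a
# Guix checkout; by default it runs the in-tree guix via pre-inst-env.
GUIX="${GUIX:-./pre-inst-env guix}"

# Try the cheap path first: refresh the package, then build it.
# Prints "ok: PKG" if both steps succeed, otherwise "needs-patching: PKG",
# which is the case that would be handed to an agent or a human.
refresh_then_build() {
    pkg="$1"
    # $GUIX is deliberately unquoted so "./pre-inst-env guix" splits into words.
    if $GUIX refresh -u "$pkg" >/dev/null 2>&1 &&
       $GUIX build "$pkg" >/dev/null 2>&1; then
        echo "ok: $pkg"
    else
        echo "needs-patching: $pkg"
    fi
}
```

Calling `refresh_then_build PACKAGE` from a Guix checkout would then separate the packages that a plain refresh fixes from the ones that still need patching.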

The time-consuming part of refreshing packages is dealing with the ones that break (or whose dependents break). Agents work extremely well for broken packages. guix refresh can already do simple things like adding new inputs, but it can't write substitutions and such, and I can't see how it ever could.

It often takes only a minute for an agent to fix a broken package, while it easily takes 20 minutes for a human, sometimes much more.

E.g., with 6000 packages depending on Python, there will be 12000 potential breakages (6000 for each of the two upgrades) when going from Python 3.12 to 3.13 to 3.14; and the experience of 3.11 to 3.12 shows that hundreds of packages can indeed break.

That can lead to hundreds of hours of work. I'm willing to do that work. But I find it very hard to justify spending hours doing something by hand that can trivially be automated by a machine. And there will be more than enough manual work left.

My Guix-wishlist is huge, yet I spent my time writing substitute* calls and adding CFLAGS. Clanker work.

I engaged here to find a justification for doing that manual labor and have not found one yet. (Other than the threat of not having code merged otherwise, which is a very bad motivator for me.)

There are some problems with using agents, yes, and I'm more than willing to have a conversation about those problems, because agentic coding is a big subject that requires more than one person to think through. At the moment, I think there is plenty of opportunity to use agentic coding while preserving, or even enhancing, our ideals.

To clarify: in my mind it is essential to keep the human in the loop, especially for a project like Guix. Humans should maintain 'ownership' of the code they contribute, humans should review and understand all of their code, humans are responsible for their code being correct, humans should do the communication with other humans, etc. For me this is implied, but I have learned it is good to spell this out.

Hugo

