Hi all,
Committing LLM-generated code is clearly not well received by everyone,
so I won't make any Pull Requests with LLM-generated code.
That means I will probably make fewer Pull Requests, as I want to learn
more about what LLMs can and cannot do. So my 'free time' will go to that
for the time being.
To expand on the experiment a bit more:
On 2/20/26 02:51, pelzflorian (Florian Pelz) wrote:
> Hugo Buddelmeijer <[email protected]> writes:
>> I think it did a pretty good job. It only took a few minutes per
>> package, and that was mostly due to build times and because it had to
>> ask permission for everything.
>
> Stunning. Hard to believe. Thank you Hugo. How much input did you
> give? So Codex automatically reads build logs
> like on Google I find
> https://bordeaux.guix.gnu.org/build/6f862905-fe58-488e-81f2-c68228175614/log
> and tries adequate imitations of prior Guix commits until it runs?
First step was to get it to find a broken package. Codex apparently
could not download it itself, so I downloaded the json file with broken
packages and gave it that with:
`There is a file jobs.json that lists the build status of all
packages. Can you pick a random failing package and fix it?`
Then it did everything autonomously, asking for permission each time it
wanted to run a shell command. Essentially it ran `guix build -K <package>`,
read the output, created a fix, and then ran `guix build` again. I think it
got the right fix immediately each time.
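For illustration only, the build, read, fix, rebuild cycle described above can be sketched as below. This is not what Codex ran internally. The `propose_fix` callback is a hypothetical stand-in for whatever edits the package definition, and the 40-line log tail and the `max_builds` limit are arbitrary assumptions. Only `guix build -K` comes from the description above.

```python
import subprocess


def build(package, run=subprocess.run):
    """Run `guix build -K PACKAGE`; return (succeeded, last lines of output)."""
    result = run(
        ["guix", "build", "-K", package],
        capture_output=True,
        text=True,
    )
    output = (result.stdout or "") + (result.stderr or "")
    # Keep only the end of the log (assumption: the error is near the end).
    tail = "\n".join(output.splitlines()[-40:])
    return result.returncode == 0, tail


def fix_until_built(package, propose_fix, run=subprocess.run, max_builds=3):
    """Build; on failure pass the log tail to `propose_fix`, then rebuild.

    Returns the number of builds it took, or None if the package still
    fails after `max_builds` builds.
    """
    for builds in range(1, max_builds + 1):
        ok, tail = build(package, run)
        if ok:
            return builds
        if builds < max_builds:
            # Hypothetical hook: edits the package definition in place.
            propose_fix(package, tail)
    return None
```

The `run` parameter exists only so the loop can be exercised without Guix installed. In real use the default `subprocess.run` would invoke `guix` directly.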
It did the random selection of a broken package using jq and shuf.
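Codex used jq and shuf for this, and the exact command is not recorded here. As a rough equivalent, the selection could look like the sketch below. The layout of jobs.json is an assumption: a JSON list of objects with a `name` field and a numeric `status` field where 1 means "failed". Adjust the field names and the status value to match the real file.

```python
import json
import random

# Assumption: status 1 marks a failed build in jobs.json.
FAILED = 1


def failing_packages(jobs):
    """Return the names of all jobs whose status marks them as failed."""
    return [job["name"] for job in jobs if job.get("status") == FAILED]


def pick_random_failing(jobs, rng=random):
    """Pick one failing package at random, or None if nothing is failing."""
    candidates = failing_packages(jobs)
    return rng.choice(candidates) if candidates else None


def load_jobs(path="jobs.json"):
    """Read the downloaded job list from disk."""
    with open(path) as handle:
        return json.load(handle)
```

Calling `pick_random_failing(load_jobs())` then yields one candidate package name to hand to the build step.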
I had to coach it a bit to write correct commit messages, though, and
that ultimately failed, so I had to edit them all by hand.
It also offered to push the branch it made and create a Pull Request,
which I declined.
I also had to ask it to first try to refresh the package before patching
it. It learned on-the-fly how `guix refresh` worked by running `guix
refresh --help`.
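The "refresh first, patch only if needed" order can be sketched as follows. This is a minimal illustration and makes two assumptions: `guix refresh -u` is run from a Guix checkout where it can rewrite the package definition in place, and the return labels `"builds"` and `"needs-patch"` are made up for this example.

```python
import subprocess


def refresh_then_build(package, run=subprocess.run):
    """Try an upstream update first; report whether a patch is still needed."""
    # `guix refresh -u` updates the package definition when a newer
    # upstream release is found; otherwise it leaves the definition alone.
    run(["guix", "refresh", "-u", package])
    rebuilt = run(["guix", "build", "-K", package])
    # Hypothetical labels, used only in this sketch.
    return "builds" if rebuilt.returncode == 0 else "needs-patch"
```

Only when this reports `"needs-patch"` would the manual (or LLM-assisted) patching step be worth starting.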
For me, this was a useful experiment in this larger discussion. I don't
yet know what my conclusions are though.
Hugo