On Tue, Mar 10, 2026 at 4:55 PM Michał Górny <[email protected]> wrote:
>
> Hello,
>
> Recently we've seen quite a grow in LLM usage in upstream packages, to
> various degrees.  The exact degree differs, apparently ranging from
> using LLMs as glorified autocomplete to asking them to rewrite the whole
> project or being convinced that a chatbot is sentient.  The results are
> raising some concerns.  I've originally written to the distributions ml
> [1] to facilitate some feedback on how other distributions are handling
> it, but the major players didn't reply.
>
> Our "AI policy" [2] covers only direct contributions to Gentoo.  At the
> time, we did not consider it appropriate to start restricting what
> packages are added to Gentoo.  Still, general rules apply and some of
> the recent packages are starting to raise concerns there.  Hence I'm
> looking for your feedback.

I don't think we can avoid AI being involved somehow in commits that
end up in Gentoo, be it ebuilds or packages. I'd argue that it is
almost unavoidable for the average developer to have at least AI
completion running in the code editor. I say "almost" and "average"
because someone who really cares certainly *can* avoid it.

For me, AI is a tool. For complex projects, I use it to discuss
changes and bugs, and it is really helpful. AI autocomplete can be a
real time saver. Yes, it has obvious and not-so-obvious downsides:
environmental pollution, market price impacts, employment impacts,
licensing questions, and a lot of other ethical concerns. That's all
bad, it has been discussed before, and it's not the point I want to
argue.

For me, the real question is how AI is used. From my perspective, you
can do a lot of good, a lot of bad, and a lot of ugly with it.

From personal experience, I'd say:

AI isn't bad per se. But AI doesn't create good code either. Adding a
feature usually introduces one new feature, two regressions, and three
new bugs the "AI didn't think of". If a developer isn't able to
understand what the AI creates, that person probably shouldn't be
developing in the first place, no matter whether AI is used or not.

I have some non-open-source projects for internal infrastructure that
I create entirely with AI, mostly as an experiment: guiding the AI
agent in the chat, writing design documents and roadmaps, then letting
it implement them in small steps. It works great at the start, but it
gets out of control fast. The code becomes overly complex; the beauty
of simplicity is usually completely missing. So I know from personal
experience what it means to use AI as the primary developer, and I've
developed a pretty good instinct for "this is a bug introduced by
coding solely with AI without understanding the code". One way around
it is to have the AI create the tests *first*, and only then write the
code and fix it until the tests pass. But this also needs proper human
review of the tests, and again, it becomes complex fast, too complex
for the agent to handle properly.
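To make that test-first loop concrete, here is a minimal sketch (the
function and its contract are entirely made up for illustration): the
human reviews and pins down the test before any implementation is
accepted, then the code is iterated until the test passes.

```python
# Hypothetical test-first example. The test below is written (and
# human-reviewed) first; the implementation is only accepted once it
# satisfies the reviewed contract.

def normalize_version(v):
    # Implementation iterated until the test passes: strip a leading
    # "v" and split the rest into integer components.
    return tuple(int(part) for part in v.lstrip("v").split("."))

def test_normalize_version():
    # The reviewed contract: "v1.2.3" and "1.2.3" denote the same version.
    assert normalize_version("v1.2.3") == (1, 2, 3)
    assert normalize_version("1.2.3") == (1, 2, 3)

test_normalize_version()
```

The point is that the human effort shifts from reviewing generated
code to reviewing the generated tests, which is smaller but, as said
above, still cannot be skipped.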

I'm also using AI agents to discuss code and bugs while developing,
and that is really helpful. It's the sort of thing I'd call external
memory. It's better at remembering what still needs to be done while
I'm working through a complex problem, and it helps me get back into a
development session I've abandoned for two or three weeks. But when I
use it as that sort of tool, I never let it write code. I only let it
suggest code; then I review it, rewrite it in my own style, and let
the AI review that. This uncovers surprisingly many bugs I wouldn't
have found on my own, and, even more surprisingly, it uncovers bugs
the AI agent would have introduced itself had I just let it code that
part. AI agents are also good at searching the code and finding the
parts I'm looking for; that's one of the biggest time savers. AI
understands code well enough that I can ask it "where is this variable
mutated in a way that causes this or that bug", and that actually
helps me understand the code faster because I learn more easily which
code to focus on. But I can also say that AI agents have become a lot
worse over the past few months, or I've just started to expect too
much from them... :-D

LLMs are also really good at writing commit messages for code I've
changed. English is not my native language, so they help a lot in
getting clear and concise messages with good wording. The result still
needs manual review, because LLMs sometimes hallucinate the "why" of a
change, but it's still a great time saver.

So I'd say, without excluding the downsides above from the discussion,
that it really depends on how the tool is used. We in Gentoo probably
cannot prevent LLMs from contributing at least somewhat to code,
whether as AI code completion, commit messages, or documentation. As
long as the usage is limited to that, it should be okay.

But I'd also say: fully AI-generated code, or worse, AI-generated code
not properly reviewed by a human, does not belong in Gentoo
infrastructure code and tooling (portage, emerge, hosting
infrastructure). AI just isn't good at creating good code. It isn't
good at creating well-behaved code. It isn't good at creating secure
and safe code. It isn't good at creating stable code. That's my
observation from using it as a coding agent.

But realistically, Gentoo has no chance of avoiding every possible AI-
or LLM-assisted contribution, whether through packages or direct
contributions.

Code needs human quality control and human review. I think that's one
quality requirement a package or contribution has to meet. If that is
guaranteed, AI-generated code is a lesser concern, at least as long as
AI bots don't dominate the bug tracker (as is happening to curl). But
that's also a different discussion, partly touched on below, and I'm
not sure it is part of the question here.


> Two recent cases that impacted the Python team were autobahn and
> chardet.
>
> In case of autobahn, the author started using two LLMs to write major
> portions of code, as well as reply to bugs.  This caused some "weird"
> bugs [3,4,5], which eventually lead me to ask upstream to reconsider
> which met with quite a backlash [6].  While I haven't checked the code,
> judging by the kind of slop that (almost) got committed as a result of
> my bug reports, we've decided that it's not safe to package the new
> versions and autobahn was deprecated in Gentoo.  Still, it has reverse
> dependencies and I don't really have the energy to go around convincing
> people to move away from it.

AI bots should not act on their own on bug reports, and AI bots should
not create bug reports. If that happens in a project, it raises a red
flag. But we need to distinguish on whose behalf this happens. The
curl developers, for example, have a lot of problems with AI-generated
bug reports. That hurts project quality not because the project is bad
but because the manpower to handle them is missing. I think that
largely covers the ethical question of why AI is bad: it's an example
of AI not being used in a responsible way.


> In case of chardet, one of the current maintainers suddenly decided to
> ask LLM to rewrite the whole codebase under a different license,
> therefore obliterating all past commits.  This triggered a huge
> shitstorm [7], in which the maintainer pretty much admitted he did this
> deliberately and was pretty much a complete asshole about it (basically,
> "I've been co-maintaining this for N years, nobody was doing anything,
> so I've used a plagiarism machine to rewrite the whole thing and claim
> it as my own work").  I haven't read most of it, given that the same
> things keep repeating: mostly GPL haters cosplaying lawyers.  We've also
> decided not to package the new version, given the ethical and licensing
> concerns.
>
> However, these aren't the only concerning packages.  As a result of
> chardet rewrite, binaryornot has replaced it with its own code which in
> which LLMs were quite likely to have been involved [8].  Linux kernel
> apparently uses them too [9].  KeePassXC too, triggering security
> concerns [10].  So does vim [11].

Without having looked at the links, I think that's exactly the point
of "it probably can't be avoided", and we have to find a proper way to
deal with it. Human quality control is one way. Of course, Gentoo
cannot take on the burden of doing that, as you outline below.


> The key problem is, how do we decide whether to package something or
> not?  We definitely don't have the capability of inspecting whatever
> crap upstream may be committing.  Of course, that was always a risk, but
> with LLMs around, things are just crazy.  And we definitely can't stick
> with old versions forever.

We can't, and we shouldn't. I think we could start by creating some
quality gates for packages: Do they work with AI? How do they work
with AI? What is their policy on AI usage? Maybe that's also something
that could be flagged inside ebuilds, so people can decide what amount
of AI impact they want to accept. We could also add alternative
suggestions for such packages that can be used as replacements, but
that would certainly explode dependency resolution or create
completely unmaintainable dependency trees. In the worst case, the
package simply has to go away, maybe to a different ebuild tree,
something like GURU but for packages involving AI in some
yet-to-be-defined way?
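Just to illustrate what such a flag could look like, here is a
hypothetical ebuild fragment. The AI_IMPACT variable and its values
are entirely invented; nothing like it exists in any current EAPI or
in PMS, and the right place might equally be metadata.xml.

```shell
# Hypothetical ebuild fragment. AI_IMPACT is an invented variable,
# not part of any existing EAPI; values are sketched for illustration.
EAPI=8

DESCRIPTION="Example package"
HOMEPAGE="https://example.org"
LICENSE="MIT"
SLOT="0"
KEYWORDS="~amd64"

# Invented scale of upstream AI involvement:
#   none       - no known AI usage upstream
#   completion - AI autocomplete / commit messages only
#   assisted   - AI-suggested code with human review
#   generated  - substantial AI-generated code
AI_IMPACT="completion"
```

Users could then mask packages above a chosen level, much like
ACCEPT_LICENSE works today, without any new dependency-resolution
machinery.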

But I also think that Gentoo's future lies neither in denying that AI
exists nor in refusing any AI involvement. It will become a tool that
cannot be avoided, and we have to deal with it in a reasonable way.
AI/LLM is still an early tool that humans have to learn to use
correctly. Currently, it mostly isn't used correctly.


> The other side of this is that I have very little motivation to put my
> human effort into dealing with random slop people are pushing to
> production these days, and reporting issues that are going to be met
> with incomprehensible slop replies.

It's exactly the human who suffers here. And if that happens, it's
also a red flag.


> [1] 
> https://lore.kernel.org/distributions/[email protected]/
> [2] https://wiki.gentoo.org/wiki/Project:Council/AI_policy
> [3] https://github.com/crossbario/autobahn-python/issues/1716
> [4] https://github.com/crossbario/autobahn-python/issues/1735
> [5] https://github.com/crossbario/autobahn-python/issues/1782
> [6] https://github.com/crossbario/autobahn-python/discussions/1818
> [7] https://github.com/chardet/chardet/issues/327
> [8] https://github.com/binaryornot/binaryornot/releases/tag/v0.5.0
> [9] https://lwn.net/Articles/1026558/
> [10] https://github.com/keepassxreboot/keepassxc/issues/12635
> [11] https://hachyderm.io/@AndrewRadev/116175986749599825

-- 
Best regards,
Kai
