Michał Górny <[email protected]> writes:
> Hello,
>
> Recently we've seen quite a growth in LLM usage in upstream packages, to
> various degrees. The exact degree differs, apparently ranging from
> using LLMs as glorified autocomplete to asking them to rewrite the whole
> project or being convinced that a chatbot is sentient. The results are
> raising some concerns. I've originally written to the distributions ml
> [1] to facilitate some feedback on how other distributions are handling
> it, but the major players didn't reply.
>
> Our "AI policy" [2] covers only direct contributions to Gentoo. At the
> time, we did not consider it appropriate to start restricting what
> packages are added to Gentoo. Still, general rules apply and some of
> the recent packages are starting to raise concerns there. Hence I'm
> looking for your feedback.
>
> Two recent cases that impacted the Python team were autobahn and
> chardet.
>
> In case of autobahn, the author started using two LLMs to write major
> portions of code, as well as reply to bugs. This caused some "weird"
> bugs [3,4,5], which eventually led me to ask upstream to reconsider,
> which was met with quite a backlash [6]. While I haven't checked the code,
> judging by the kind of slop that (almost) got committed as a result of
> my bug reports, we've decided that it's not safe to package the new
> versions and autobahn was deprecated in Gentoo. Still, it has reverse
> dependencies and I don't really have the energy to go around convincing
> people to move away from it.
>
> In case of chardet, one of the current maintainers suddenly decided to
> ask an LLM to rewrite the whole codebase under a different license,
> therefore obliterating all past commits. This triggered a huge
> shitstorm [7], in which the maintainer pretty much admitted he did this
> deliberately and was pretty much a complete asshole about it (basically,
> "I've been co-maintaining this for N years, nobody was doing anything,
> so I've used a plagiarism machine to rewrite the whole thing and claim
> it as my own work"). I haven't read most of it, given that the same
> things keep repeating: mostly GPL haters cosplaying lawyers. We've also
> decided not to package the new version, given the ethical and licensing
> concerns.
>
> However, these aren't the only concerning packages. As a result of
> the chardet rewrite, binaryornot has replaced it with its own code, in
> which LLMs were quite likely to have been involved [8]. The Linux kernel
> apparently uses them too [9]. KeePassXC too, triggering security
> concerns [10]. So does vim [11].
>
> The key problem is, how do we decide whether to package something or
> not? We definitely don't have the capability of inspecting whatever
> crap upstream may be committing. Of course, that was always a risk, but
> with LLMs around, things are just crazy. And we definitely can't stick
> with old versions forever.
>
> The other side of this is that I have very little motivation to put my
> human effort into dealing with random slop people are pushing to
> production these days, and reporting issues that are going to be met
> with incomprehensible slop replies.
>
>
> [1]
> https://lore.kernel.org/distributions/[email protected]/
> [2] https://wiki.gentoo.org/wiki/Project:Council/AI_policy
> [3] https://github.com/crossbario/autobahn-python/issues/1716
> [4] https://github.com/crossbario/autobahn-python/issues/1735
> [5] https://github.com/crossbario/autobahn-python/issues/1782
> [6] https://github.com/crossbario/autobahn-python/discussions/1818
> [7] https://github.com/chardet/chardet/issues/327
> [8] https://github.com/binaryornot/binaryornot/releases/tag/v0.5.0
> [9] https://lwn.net/Articles/1026558/
> [10] https://github.com/keepassxreboot/keepassxc/issues/12635
> [11] https://hachyderm.io/@AndrewRadev/116175986749599825
If it were possible, I would personally like to have these packages
marked in some way, for example through a separate profile.
That being said, AI code is now so widespread that it is even in the
Linux kernel, so I think a system without any AI code is unrealistic
at this point.
- sova