+1

I think it could be improved by taking this concern into account:
All contributions that were produced using assisted coding tools must
make that clear as part of the contribution, to simplify the role of the
reviewer and future maintenance of the code.

I think this is consistent with normal traditional ethical behaviour to
clarify the origin of contributions.  This is not that different from a
git commit saying 'Run indent on code', or 'Update copyright years' or
some other mechanical tool-generated code update.

I worry that this may limit one reasonable use of LLM coding assistants:
test code.  Those tools can be used to generate large chunks of boring
test code.  Test code is often harmless: tests either PASS or FAIL, and
we can check this continuously.  I see some value in allowing that, but
it would fall outside the policy below.  The attack surface for test
code is much smaller than for actual running code.

/Simon

Bruno Haible via Gnulib discussion list <[email protected]> writes:

> Hi,
>
> I would like to propose a policy regarding LLM regenerated code in Gnulib.
>
> The reason is that on one hand
>   - Since the beginning of 2025, there has been a trend for "vibe coding" [1],
>   - Even famous people like Linus Torvalds make use of it.
> and on the other hand, there are issues with it, in particular
>   - Copyright and license issues: How to identify regurgitated copyrighted
>     code?
>   - Maintainability: Code never reviewed by a human programmer, larger
>     than and possibly less well commented than what a human programmer
>     would produce.
>
> [1] https://en.wikipedia.org/wiki/Vibe_coding
>
> Outside of the scope of this proposal are uses of LLMs that don't generate
> code. For these cases, it is already well-known that you need to fact-check
> the LLM's answers.
>
> Here's a proposed addition to the HACKING file.
>
> ================================================================================
>
> Acceptable use of LLM generated code
> ====================================
>
> General-purpose LLMs as well as LLMs specialized for software programming
> can produce ready-to-use and, in many cases, actually working code.
>
> We need to avoid two problems with that:
>
>   * Copyright and license issue: An LLM may regurgitate a piece of copyrighted
>     code without the copyright header, thus violating the code's license.
>     (Most code licenses require that the copyright header remains intact when
>     the code is copied or becomes the basis of derivative works.)
>
>   * Maintainability issues: Such generated code has initially not been
>     reviewed by a human programmer. It is often greater in size than what a
>     careful programmer would write. Sometimes it also lacks comments.
>     People who use "vibe coding" often also observe that the code is of
>     lower quality.
>     Where software in general can be qualified as for long-term use vs.
>     short-term use, vibe coding tends to be more suitable for short-term used
>     software.
>
> To this end:
>
>   1) Code included in this package that comes from a single LLM prompt
>      must be limited in size: it must be at most 5 lines long.
>
>   2) As a submitter, you assert that you have reviewed such code that you
>      submit.
>
> Rule 1 guarantees that the LLM generated code size is smaller than the
> "legally significant for copyright purposes" threshold, see
> https://www.gnu.org/prep/maintain/html_node/Legally-Significant.html
>
> Rule 2 encourages you to not submit unreviewed garbage.
>
> ================================================================================
>
> Related policies:
>   * Linux
>     https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/generated-content.rst
>     https://lwn.net/Articles/1032612/
>   * Asahi Linux
>     https://asahilinux.org/docs/project/policies/slop/
>   * FreeBSD
>     https://www.heise.de/en/news/FreeBSD-policy-AI-generated-source-code-No-thanks-10634141.html
>   * LLVM
>     https://github.com/llvm/llvm-project/blob/main/llvm/docs/AIToolPolicy.md
>
> Let us know what you think.
>
>       Bruno
