On 4/6/25 09:15, Daniel P. Berrangé wrote:
On Wed, Jun 04, 2025 at 08:17:27AM +0200, Markus Armbruster wrote:
Stefan Hajnoczi <stefa...@gmail.com> writes:
On Tue, Jun 3, 2025 at 10:25 AM Markus Armbruster <arm...@redhat.com> wrote:
From: Daniel P. Berrangé <berra...@redhat.com>
+
+The increasing prevalence of AI code generators, most notably but not limited
More detail is needed on what an "AI code generator" is. Coding
assistant tools range from autocompletion to linters to automatic code
generators. In addition, there are other AI-related tools, like ChatGPT
or Gemini used as a chatbot, that people can use like Stack Overflow or
as an API documentation summarizer.
I think the intent is to say: do not put code that comes from _any_ AI
tool into QEMU.
It would be okay to use AI to research APIs and algorithms, brainstorm
ideas, debug the code, analyze the code, etc., but the actual code
changes must not be generated by AI.
The scope of the policy is around contributions we receive as
patches with a Signed-off-by (SoB). Researching / brainstorming /
analysis etc. are not contribution activities, so they are not covered
by the policy IMHO.
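(For reference, "SoB" is the Signed-off-by trailer that `git commit -s`
appends to a commit message to assert the Developer Certificate of
Origin; the name and address here are placeholders:

    Signed-off-by: Jane Developer <jane@example.com>
)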
The existing text is about "AI code generators". However, the "most
notably LLMs" that follows it could lead readers to believe it's about
more than just code generation, because LLMs are in fact used for more.
I figure this is your concern.
We could instead start wide, then narrow the focus to code generation.
Here's my try:
The increasing prevalence of AI-assisted software development results
in a number of difficult legal questions and risks for software
projects, including QEMU. Of particular concern is code generated by
`Large Language Models
<https://en.wikipedia.org/wiki/Large_language_model>`__ (LLMs).
Documentation we maintain has the same concerns as code.
So I'd suggest substituting 'code' with 'code / content'.
Why couldn't we accept documentation patches improved using an LLM?
As a non-native English speaker who is often stuck trying to describe
function APIs, I'm very tempted to use an LLM to review my sentences
and make them easier to understand.
If we want to mention uses of AI we consider okay, I'd do so further
down, to not distract from the main point here. Perhaps:
The QEMU project thus requires that contributors refrain from using AI code
generators on patches intended to be submitted to the project, and will
decline any contribution if use of AI is either known or suspected.
This policy does not apply to other uses of AI, such as researching APIs or
algorithms, static analysis, or debugging.
Examples of tools impacted by this policy include GitHub's CoPilot,
OpenAI's ChatGPT, and Meta's Code Llama, amongst many others which are
less well known.
The paragraph in the middle is new, the other two are unchanged.
Thoughts?
IMHO it's redundant, as the policy is expressly around contribution of
code/content, and those activities are not contribution-related, so
they are outside the scope already.
+to, `Large Language Models <https://en.wikipedia.org/wiki/Large_language_model>`__
+(LLMs) results in a number of difficult legal questions and risks for software
+projects, including QEMU.
Thanks!
[...]
With regards,
Daniel