Heya, thanks for drafting this. I agree with the sentiment already
stated elsewhere in the thread that "it is too soon for a GR". But in
the meantime here are a few minor comments and further reads relevant
for your text.

On Wed, Feb 18, 2026 at 11:37:51PM +0100, Lucas Nussbaum wrote:
> 1. some bits are heavily based on:
>     * Linux Foundation -- Guidance Regarding Use of Generative AI Tools
>       for Open Source Software Development
>       https://www.linuxfoundation.org/legal/generative-ai
>     * Fedora -- AI-assisted Contributions Policy
>       https://docs.fedoraproject.org/en-US/council/policy/ai-contribution-policy/
>     * Matplotlib -- Restrictions on Generative AI Usage
>       https://matplotlib.org/devdocs/devel/contribute.html#restrictions-on-generative-ai-usage

In the various recent threads on the broad "AI" topic in Debian, several
related works like the above have been listed. To my surprise, nobody
mentioned the recent documents from the Linux kernel community. Hence
here they are, because I found them to be both very interesting and
relevant for our discussion:

- Kernel Guidelines for Tool-Generated Content:
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/generated-content.rst

- AI Coding Assistants:
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-assistants.rst

Yes, Lucas, I understand that your proposal is *not* inspired by these,
so it's normal that you didn't reference them :-) Still, I think it'd be
useful for the Debian community to read these documents. I also quote
from them below, for comparison.

> 1. **Legal Compatibility:** Contributors should ensure that the terms and
>    conditions of the generative AI tool do not impose contractual
>    restrictions that conflict with the distribution, modification, or use of
>    the output in the context of Debian.

(nitpick) I agree with this one, but the title isn't great: it makes
people (or at least me) think of license/copyright compatibility issues
in the output, whereas the point is more specific, as the expanded
version clearly describes. How about "legal use" or "permitted use"?
Neither is great either. Maybe someone else will come up with a better
suggestion.

> 2. **Licensing and Attribution:** If any pre-existing copyrighted materials
>    (including pre-existing code licensed as free software) authored or owned
>    by third parties are included in the AI tool’s output, prior to
>    contributing such output to the project, the contributor should verify
>    that such materials are available under a compatible license.
>    Additionally, the contributor should provide notice and attribution of
>    such third party rights, along with information about the applicable
>    license terms, with their contribution.

I agree with this one too, but I find the wording too strong, because it
implies an active action on contributors' part *for each use of a given
tool*. We know that is not going to happen. What is more realistic is
that contributors will vet *tools* for this (for example, some coding
assistants have integrated code clone / plagiarism / provenance
detection capabilities that would help here and that can be enabled once
and for all). Any remaining license violations (if any) can be caught
and fixed downstream, as already happens today for license violations
introduced into the archive for any other reason.

For comparison, the Linux kernel guidelines on this topic say "All code
must be compatible with GPL-2.0-only". Note how the wording is on the
result, without saying what contributors should do at each use.

> 3. **Accountability:** Contributors assume full responsibility for their
>    contributions, including vouching for the technical merit, security,
>    license compliance, and utility of their submissions. The contributor
>    remains solely accountable for the entirety of these contributions.
>    Contributors should fully understand the proposed changes and be prepared
>    to justify them.
> 
> 4. **Explicit Disclosure:** When a significant portion of the contribution
>    is taken from a tool without manual modification, contributors should
>    disclose the tool's use. This may be recorded using Git trailers, such as
>    `Generated-By:` or `Assisted-By:`.

Again, for comparison purposes, here's what the kernel community came up
with about these two:

  > Signed-off-by and Developer Certificate of Origin
  > =================================================

  > AI agents MUST NOT add Signed-off-by tags. Only humans can legally
  > certify the Developer Certificate of Origin (DCO). The human submitter
  > is responsible for:

  > * Reviewing all AI-generated code
  > * Ensuring compliance with licensing requirements
  > * Adding their own Signed-off-by tag to certify the DCO
  > * Taking full responsibility for the contribution

  > Attribution
  > ===========

  > When AI tools contribute to kernel development, proper attribution
  > helps track the evolving role of AI in the development process.
  > Contributions should include an Assisted-by tag in the following format::

  >   Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]

  > Where:

  > * ``AGENT_NAME`` is the name of the AI tool or framework
  > * ``MODEL_VERSION`` is the specific model version used
  > * ``[TOOL1] [TOOL2]`` are optional specialized analysis tools used
  >   (e.g., coccinelle, sparse, smatch, clang-tidy)

  > Basic development tools (git, gcc, make, editors) should not be listed.

  > Example::

  >   Assisted-by: Claude:claude-3-opus coccinelle sparse

Hope this helps,
Cheers
-- 
Stefano Zacchiroli - https://upsilon.cc/zack
Full professor of Computer Science, Polytechnic Institute of Paris
Co-founder & CSO Software Heritage
